#!/usr/bin/perl -w

=head1 NAME

netcomics - retrieve comics from the Internet

=head1 SYNOPSIS

B<netcomics> S<[B<-lhDvvs>] [B<-p> I<proxy>] [B<-n>,B<-N> I<days>] [B<-c>,B<-C> I<ids>]
[B<-S,-T,-E> I<date>] [B<-d,-m,-t> I<dir>]> [B<-f> I<date_fmt>] [B<-w,-W>[=I<n>] [B<-wt> I<title>]]

=head1 DESCRIPTION

The I<netcomics> program downloads today's comic strips from
the Web and places them in a spool directory, where they can be
retrieved for display.  Because each website that carries comic
strips chooses how old the strips it shows are, the comic strips
downloaded will actually be from different dates.  Most common are
comic strips that are 1 week, 2 weeks, and 3 weeks old.  Also, the
websites are not all updated at the same time, nor is any of them
updated at a consistent time during the day.  Therefore, when running
I<netcomics> as a cron job in the early morning, you may need to rerun
it by hand a little later in the day to get the comics that it couldn't
find.  The exact command to run is given at the end of the I<netcomics>
output if any failures occurred.

I<netcomics> also supports retrieving specific dates of comics.  A
"Starting Date" may be specified with B<-S>.  An "Ending Date" may be
specified with B<-E>.  And a specific date may be specified with
B<-T>.  All dates are given in the M-D-[YY]YY format.  If a start date
is given without an end date, B<-n> may be used to specify the number
of days of comics to retrieve starting at the specified start date.
If an end date is given without a start date, B<-n> will specify the
number of days of comics to retrieve counting backwards from the end
date.  If a start date is given without B<-n> or B<-E>, then the end
date is assumed to be today's date.  If the argument given to B<-E>,
B<-S>, or B<-T> is just an integer, that integer is interpreted as
the number of days prior to the current day, which lets you specify
dates relative to today.
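
The date arguments described above can be handled along the following
lines.  This is only a sketch, not the code I<netcomics> itself runs:
the helper name C<parse_date_arg> is hypothetical, and reading a
two-digit year below 70 as 20xx is an assumption.

```perl

    use strict;
    use warnings;
    use POSIX qw(mktime);

    # Hypothetical helper: turn "M-D-[YY]YY" or a plain day count into
    # an epoch timestamp; returns nothing when the argument is neither.
    sub parse_date_arg {
        my ($arg, $now) = @_;
        $now = time unless defined $now;
        if ($arg =~ /^(\d\d?)-(\d\d?)-(\d\d(?:\d\d)?)$/) {
            my ($mon, $mday, $year) = ($1, $2, $3);
            # mktime() counts years from 1900
            if (length($year) == 4) { $year -= 1900 }
            elsif ($year < 70)      { $year += 100 }  # assumption: 00-69 => 20xx
            return mktime(0, 0, 12, $mday, $mon - 1, $year);
        }
        # a bare integer means "this many days before now"
        return $now - $arg * 24 * 3600 if $arg =~ /^\d+$/;
        return;
    }

    my @t = localtime(parse_date_arg("2-3-99"));
    printf "%04d-%02d-%02d\n", $t[5] + 1900, $t[4] + 1, $t[3];  # 1999-02-03

```

Using noon as the time of day keeps the date stable when a daylight
saving shift moves the clock by an hour.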

Another way to specify the dates of the comics to download is to use
B<-N> in conjunction with B<-n>.  The argument to B<-N> is the number of
days prior to the current day from which to retrieve comics.  Note that
this does not specify an actual date.  It indicates the number of days
ago a comic was made available, rather than the actual date of the
comic.  To see how far the date of each comic lags behind the date
it actually gets released, use B<-l>.  So if today is Wednesday, and you
specify B<-N 2>, I<netcomics> will download the comics that were made
available on Monday.  This is useful for running I<netcomics> in a
timezone that is well ahead of the timezones in which the comics'
websites are located.
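
How B<-N> and B<-n> combine into a set of release days can be sketched
as below.  The helper name C<release_days> is hypothetical, and a day is
simply taken to be 24 hours long.

```perl

    use strict;
    use warnings;
    use POSIX qw(strftime);

    # Hypothetical helper: list the release days covered by -N and -n,
    # oldest first, as epoch timestamps.
    sub release_days {
        my ($now, $days_prior, $days_of_comics) = @_;
        my $end = $now - $days_prior * 24 * 3600;
        return map { $end - $_ * 24 * 3600 } reverse 0 .. $days_of_comics - 1;
    }

    # "netcomics -N 3 -n 3" covers five, four, and three days ago
    print strftime("%Y-%m-%d", localtime($_)), "\n"
        for release_days(time, 3, 3);

```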

I<netcomics> was created for the purpose of giving your weary mind a
little relief from your hectic workday, so another script
called I<display_comics> is also provided as an example of a way to
periodically show the retrieved comic strips throughout the workday.

B<Important:>  The further east your timezone is from the US, the later
in the day you'll have to run I<netcomics>.  As a reference, I suggest
that those whose timezone is GMT wait until 12:30pm to run the script
if they want to get all the comics at a time by which all of the
websites are likely to have been updated.  Use B<-N> if you want to run
the script early in the morning and are having problems getting comics
to download.  Also, just because a comic failed to download doesn't mean
that the module for that comic is broken; most likely the website
just hasn't been updated yet.

I<netcomics> can also create an HTML file, "index.html", in the
directory where the comics are placed.  If a number is given with
B<-W> or B<-w>, it is used as the number of comics to place in each
HTML file.  Subsequent files are named "comics#.html", where # is the
page number.
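
The way comics are spread over several pages can be sketched as
follows.  The helper name C<paginate> is hypothetical; the page file
names follow the C<$webpage> and C<$fname_tmpl> values in the script
source.

```perl

    use strict;
    use warnings;

    # Hypothetical helper: split a list of comic files into pages and
    # pair each page with the name of its HTML file.
    sub paginate {
        my ($per_page, @comics) = @_;
        # without a usable page size, everything goes on one page
        $per_page = @comics if !defined($per_page) || $per_page < 1;
        my @pages;
        while (@comics) {
            my $n = @pages + 1;
            my $file = $n == 1 ? "index.html" : "comics$n.html";
            push @pages, [$file, splice(@comics, 0, $per_page)];
        }
        return @pages;
    }

    for my $page (paginate(2, qw(a.gif b.gif c.gif d.gif e.gif))) {
        my ($file, @on_page) = @$page;
        print "$file: @on_page\n";
    }

```

With five comics and two per page, the last page holds the single
remaining comic.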

B<Disclaimer:> Do not put the comics up on the Internet!  You should
only use them for your own use.  Also, do not redistribute the comics
downloaded by I<netcomics> in any other way unless you receive written
authorization from each publisher.

=head1 OPTIONS

=over 4

=item B<-c> I<'comic_ids'>

Get the supplied comics (ids are separated by white spaces).
This option may be repeated, and may not be used in conjunction with B<-C>.

=item B<-C> I<'comic_ids'>

Don't get the supplied comics (ids are separated by white spaces).
This option may be repeated, and may not be used in conjunction with B<-c>.

=item B<-d> I<dir>

Put comics into directory. Default is /var/spool/netcomics.

=item B<-D>

Delete files in directory before retrieving.

=item B<-E> [I<date> | I<days>]

Specify an ending date, or the date that is the specified number of
days prior to today, with which to define the range of days of comics
to retrieve.  Must be used in combination with one of B<-S> or B<-n>.
The date is of the form: M-D-Y.

=item B<-f> I<date_fmt>

Specify the date format used when naming files.  Default is C<'%y%m%d'>.

=item B<-h>

Show usage. Comics will not be downloaded.

=item B<-l>

List supported comics & their identifiers. Comics will not be downloaded.

=item B<-m> I<dir>

Add dir to the locations of comic modules. Default is /usr/lib/netcomics.
This option may be repeated to add multiple directories.

=item B<-n> I<days>

Retrieve this number of days of comics, going backwards. Default is 1.
This option may be used in conjunction with B<-N>.

=item B<-N> I<days>

Start retrieving comics this many days before the currently available date.
If you use B<-l> to show the comic ids, the 3rd column indicates the number
of days by which each comic lags behind.  By default, if neither B<-E> nor
B<-S> is specified, I<netcomics> retrieves each comic on the understanding
that the latest one available is that many days old (according to the number
shown in the 3rd column).  Use this option to push the number of days back
even further.  Default is 0.  This option may be used in conjunction with
B<-n>.  It may not be used in conjunction with B<-S> or B<-E>.

=item B<-p> I<url>

Specify a URL to use as a proxy.  Both HTTP and FTP are supported.

=item B<-s>

Don't skip bad comics when creating the webpage.  This will potentially
cause the webpage to be loaded into a browser more slowly, but it will
make it evident exactly which websites don't return proper HTTP errors.

=item B<-S> [I<date> | I<days>]

Specify a starting date, or the date that is the specified number of
days prior to today, with which to define the range of days of
comics to retrieve.  May be used in combination with one of B<-E> or
B<-n>. The date is of the form: M-D-Y.

=item B<-t> I<dir>

Specify the location of html template files. Default is 
/usr/lib/netcomics/html_tmpl.

=item B<-T> [I<date> | I<days>]

Specify a specific date, or the date that is the specified number of
days prior to today, of comics to retrieve.  This option may be
repeated. The date is of the form: M-D-Y.

=item B<-v>

Be a little verbose.

=item B<-vv>

Be extra verbose.

=item B<-w>[=n]

Create an html file, index.html, for the comics downloaded. Optionally,
n specifies the number of comics to have in each page, where subsequent
html files are named comics#.html.

=item B<-wt> I<title>

Specify a title for the webpage rather than the default
("Today's Comics From the Web on E<lt>DATEE<gt>").  This is useful
for when you download specific comics, and want the title of the
webpage to reflect the actual contents.

=item B<-W>[=n]

Recreate the html file, index.html, from the comics that are in the
directory, as well as any new comics downloaded.  Optionally,
n specifies the number of comics to have in each page, where subsequent
html files are named comics#.html.

=back
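
The date placeholders accepted in a B<-wt> title can be expanded as
sketched below.  The helper name C<expand_title> is hypothetical; the
default date layout is the one the script uses for its own page header.

```perl

    use strict;
    use warnings;
    use POSIX qw(strftime);

    # Hypothetical helper: expand the date placeholders in a page title.
    sub expand_title {
        my ($title, $time) = @_;
        my @lt = localtime($time);
        my $default = strftime("%b %d, %Y", @lt);
        # the plain placeholder gets the default layout
        $title =~ s/<DATE>/$default/g;
        # a placeholder with FORMAT gets its own strftime() layout
        $title =~ s/<DATE FORMAT="([^"]*)">/strftime($1, @lt)/ge;
        return $title;
    }

    print expand_title('Comics From <DATE FORMAT="%Y">', time), "\n";

```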

=head1 EXAMPLES

=over 3

=item 1.

Run as a cron job at 7:30am, Monday through Friday, removing
the previous day's comics beforehand, and creating a webpage.
And for Monday, also retrieve Saturday & Sunday's comics.

   30 07  *  *  2-5  /usr/bin/netcomics -D -w
   30 07  *  *  1    /usr/bin/netcomics -n 3 -D -w


=item 2.

Same as before, except, for Monday, get Saturday's & Sunday's
comics, and for Tuesday, get Monday & Tuesday's.  This is so there
isn't such an overload of comics on Monday.

   30 07  *  *  1    /usr/bin/netcomics -E 1 -n 2 -D -w
   30 07  *  *  2    /usr/bin/netcomics -n 2 -D -w
   30 07  *  *  3-5  /usr/bin/netcomics -D -w

=item 3.

Grab Dilbert & Foxtrot comics from the past 30 days, place them in /tmp,
and create a webpage with a specific title (<DATE> gets replaced with the
name of the month).

   netcomics -c "dilbert ft" -n 30 -d /tmp -w \
     -wt 'Dilbert & Foxtrot Comics From the Month of <DATE FORMAT="%b">'

=item 4.

Specify the date range of comics to retrieve to be from Feb 3, 1999
to Feb 6, 1999, and also get comics on March 3, 1998.

   netcomics -S 2-3-99 -E 2-6-99 -T 3-3-98

=item 5.

Specify the date range of comics to retrieve to be from Jan 6, 1999
and the 5 days before it.  Get all the comics except Jerkcity and Doodie.

   netcomics -E 1-6-99 -n 6 -C jc -C doodie

=item 6.

Specify the date range of comics to retrieve to be all those that became
available three, four, and five days ago.

   netcomics -N 3 -n 3

=back

=head1 FILES

=over 29

=item /usr/lib/netcomics

Directory containing the modules that return RLI's.

=item /usr/lib/netcomics/html_tmpl

Directory containing the HTML template files used to create the
webpage.

=item /var/spool/netcomics

Default directory where comics and the webpage are placed.

=item /usr/bin/display_comics

Example script that should be modified to be used to display the
downloaded comics.

=back
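
A comic module might look roughly like the sketch below.  The script
calls each module function with a timestamp and expects a hash
reference holding C<name> and C<base> plus at least one of C<page>,
C<expr>, C<exprs>, or C<func>.  Everything else here is an assumption:
the comic, its URL, and its expression are made up, and the way the
function is registered in C<%hof> is inferred from how the script uses
that hash.

```perl

    use strict;
    use warnings;
    use POSIX qw(strftime);

    # In a real module these globals would come from netcomics itself.
    our (%hof, $date_fmt);
    $date_fmt = "%y%m%d" unless defined $date_fmt;

    # Assumption: a module registers its function name together with
    # the number of days its website lags behind.
    $hof{"example_strip"} = 7;

    # Hypothetical comic: returns the resource locator information
    # (RLI) for the strip published on the given day.
    sub example_strip {
        my $time = shift;
        my @lt = localtime($time);
        return {
            name => strftime("Example_Strip-$date_fmt.gif", @lt),
            base => "http://comics.example.com/",
            page => strftime("archive/%Y%m%d.html", @lt),
            expr => '<img src="/(strips/[^"]+\.gif)"',
        };
    }

    my $rli = example_strip(time);
    print "$rli->{name} from $rli->{base}$rli->{page}\n";

```

The text captured by C<expr> is appended to C<base> to form the URL of
the image, so the expression should capture a path relative to it.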

=head1 TODO

=over 4

=item 1. 

Have options be able to be supplied smushed together.

=item 2. 

Finish each of the disabled modules.

=item 4.

Use a .netcomicsrc file to specify options.

=item 5.

Add timezone adjustment logic & include general TZ for each rli.

=item 6.

Implement a better display_comics program.

=item 7.

Find a program that can display animated gifs.

=item 8.

Implement a backup-site scheme.

=item 9.

Add webpage creation option to arrange comics in webpages by
cumulative size.

=item 10.

Add webpage creation option to specify order of comics in webpages.

=item 11.

Find screensavers that can show the comics.

=item 12.

Add option to delete a comic file if it fails the Image::Size test.

=back

=head1 AUTHOR

Ben Hochstedler <hochstrb@cs.rose-hulman.edu>
ICQ: B<15469308> AIM: B<hochstrb>

=cut

#This script requires the following CPAN packages:
#libwww-perl, HTML-Parser, Image-Size
#
#To add your own comic lib modules, first visit the comic strip resource at:
#http://comics.kurle.com/resource/
#to find the comic you want.

$0 =~ s,.*/,,;  # use basename only

require LWP;
require URI::URL;
require POSIX;
require Image::Size;

use strict;
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;
use HTTP::Request::Common;
use POSIX;
use Image::Size 'html_imgsize';

#global vars
my $script_name = "netcomics";
my $webpage = "index.html";
my $files_mode = 0644;
use vars('@lof'); @lof = ();    #list of functions which return an rli hash
use vars('%hof'); %hof = ();    #hash of functions which return an rli hash
use vars('@rli'); @rli = ();    #resource locator information
use vars('$date_fmt'); $date_fmt = "%y%m%d"; #date format in filenames

#default options values
my @libdirs = ("/usr/lib/$script_name");
my $html_tmpl_dir = "$libdirs[0]/html_tmpl";
my $tmpdir = "/var/spool/$script_name";
my $verbose = 0; #default to not verbose
my $extra_verbose = 0; #default to not extra verbose
my $delete_files = 0;
my @userlof = ();
my $do_list_comics = 0;
my $make_webpage = 0; #default to not create the webpage
my $remake_webpage = 0;
my $proxy_url = undef;
my $days_of_comics = undef; #number of days of comics to get, going backwards.
my $days_prior = 0; #number of days prior to start at.
my @dates = (); #dates of comics to get
my ($start_date,$end_date) = (undef,undef);
my $given_options = ""; #used in case there were any failures to help
			#reconstruct the command line options to use.
my $comics_per_page = undef;
my $user_specified_comics = 0;
my $user_unspecified_comics = 0;
my $skip_bad_comics = 1;
my $webpage_title = "Today's Comics From The Web on <DATE>";

STDOUT->autoflush(1);
STDERR->autoflush(1);

#parse command line options
while (@ARGV) {
    $_ = shift(@ARGV);

    #Get specific comics, or exclude specific comics
    if (/-(c)/i) {
	if (@ARGV > 0) {
	    my @ids = split(' ',shift(@ARGV));
	    if ($1 eq 'c') {
                $user_specified_comics = 1;
                push(@userlof,@ids);
            } else {
                $user_unspecified_comics = 1;
                push(@userlof,@ids);
	    }
	} else {
	    print STDERR "Need a space-delimited list of comic id's.";
	    print STDERR "  Use -h for usage.\n";
	    exit 1;
	}
	if ($user_unspecified_comics && $user_specified_comics) {
	    print STDERR "Can only use one of -c and -C. Use -h for usage.\n";
	    exit 1;
	}
    }
    
    #List supported comics
    elsif (/-l/) {
	$do_list_comics = 1;
    }
    
    #set the directory
    elsif (/-d/) {
	if (@ARGV > 0) {
	    $tmpdir = shift(@ARGV);
	    $given_options .= " -d $tmpdir"
	} else {
	    print STDERR "Need a directory name. Use -h for usage.\n";
	    exit 1;
	}
    }
    
    #verbose
    elsif (/-v$/) {
	$verbose = 1;
	$given_options .= " -v";
    }
    
    #Extra Verbosity
    elsif (/-vv/) {
	$extra_verbose = 1;
	$verbose = 1;
	$given_options .= " -vv";
    }
    
    #Delete the files
    elsif (/-D/) {
	$delete_files = 1;
	$given_options .= " -D";
    }

    #Create webpage?
    elsif (/-([wW])\=?(\d+)?$/) {
	$comics_per_page = $2 if defined($2) && $2 > 0;
 	$make_webpage = 1;
	$remake_webpage = 1 if $1 =~ /W/;
    }

    #Use a Proxy?
    elsif (/-p/) {
	if (@ARGV > 0) {
	    $proxy_url = shift(@ARGV);
	    $given_options .= " -p $proxy_url";
	} else {
	    print STDERR "Need a URL to use as the proxy. " .
		"Use -h for usage.\n";
	    exit 1;
	}
    }

    #Number of days of comics to get, going backwards
    elsif (/-n/) {
	if (@ARGV > 0) {
	    $days_of_comics = shift(@ARGV);
	    $given_options .= " -n $days_of_comics";
	} else {
	    print STDERR "Need a number for an argument to -n. " .
		"Use -h for usage.\n";
	    exit 1;
	}
    }

    #HTML Template Directory
    elsif (/-t/) {
	if (@ARGV > 0) {
	    $html_tmpl_dir = shift(@ARGV);
	    $given_options .= " -t $html_tmpl_dir";
	} else {
	    print STDERR "Need a directory for an argument to -t. " .
		"Use -h for usage.\n";
	    exit 1;
	}
    }

    #Comic Module Directory
    elsif (/-m/) {
	if (@ARGV > 0) {
	    my $dir = shift(@ARGV);
	    push(@libdirs,$dir);
	    $given_options .= " -m $dir";
	} else {
	    print STDERR "Need a directory for an argument to -m. " .
		"Use -h for usage.\n";
	    exit 1;
	}
    }

    #Specified date
    elsif (/-([STE])/) {
	my $type = $1;
	my $good = 0;
	if (@ARGV > 0) {
	    my $ds = shift(@ARGV);
	    my $ts = undef;
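	    if ($ds =~ /^([0-1]?[0-9])-([0-3]?[0-9])-(\d\d(?:\d\d)?)$/) {
		my ($mon,$mday,$year) = ($1,$2,$3);
		#mktime wants the number of years since 1900
		if (length($year) == 4) {
		    $year -= 1900;
		} elsif ($year < 70) {
		    $year += 100;
		}
		$ts = mktime(0,0,12,$mday,$mon-1,$year);
	    } elsif ($ds =~ /^([+-]?\d+)$/) {
		$ts = time - ($1 * 24*3600);
	    }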
	    if (defined($ts)) {
		$given_options .= " -$type $ds";
		$good = 1;
		$_ = $type;
		if (/T/) {
		    push(@dates,$ts);
		} elsif (/S/) {
		    $start_date = $ts;
		} elsif (/E/) {
		    $end_date = $ts;
		}
	    }
	}
	
	unless ($good) {
	    print STDERR "Need a date for an argument to -$type. ";
	    print STDERR "It has the form: M-D-[YY]YY\nOr it is the number ";
	    print STDERR "of days prior to today to specify the date.\n";
	    exit 1;
	}
    }

    #don't skip bad comics
    elsif (/-s/) {
	$skip_bad_comics = 0;
	$given_options .= " -s";
    }

    #webpage title
    elsif (/-wt$/) {
	if (@ARGV > 0) {
	    $webpage_title = shift(@ARGV);
	    $given_options .= " -wt '$webpage_title'";
	} else {
	    print STDERR "Need a string for an argument to -wt. " .
		"Use -h for usage.\n";
	    exit 1;
	}
    }

    #date format used when naming files
    elsif (/-f/) {
	if (@ARGV > 0) {
	    $date_fmt = shift(@ARGV);
	    $given_options .= " -f '$date_fmt'";
	} else {
	    print STDERR "Need a date format for an argument to -f. " .
		"Use -h for usage.\n";
	    exit 1;
	}
    }

    #Number of days prior to the current day to start at
    elsif (/-N/) {
	if (@ARGV > 0) {
	    $days_prior = shift(@ARGV);
	    $given_options .= " -N $days_prior";
	} else {
	    print STDERR "Need a number for an argument to -N. " .
		"Use -h for usage.\n";
	    exit 1;
	}
    }

    #Usage
    else {
	usage();
    }

}

#check to make sure proper sets of options were provided
if (defined($end_date)) {
    if (!defined($start_date)) {
	if (!defined($days_of_comics)) {
	    print STDERR "A starting date or the number of days of comics to ";
	    print STDERR "retrieve must be provided\nwhen an ending date is. ";
	    print STDERR "Use -h for usage.\n";
	    exit 1;
	}
    } elsif ($start_date > $end_date) {
	print STDERR "The starting date must be before the ending date. ";
	print STDERR "Use -h for usage.\n";
	exit 1;
    } elsif (defined($days_of_comics)) {
	print STDERR "The number of days of comics to retrieve may not be ";
	print STDERR "specified when the starting and\nending dates are. ";
	print STDERR "Use -h for usage.\n";
	exit 1;
    }
}
if ($days_prior && (defined($end_date) || defined($start_date))) {
    print STDERR "-N may not be specified with -S or -E.  Use -h for usage.\n";
    exit 1;
}

load_modules(@libdirs);
if ($do_list_comics) {
    my %tmphof = ();
    foreach (@lof) {
	$tmphof{$_} = "?";
    }
    list_comics(%hof,%tmphof);
}

#make sure user specified existing functions
#and set the @lof & %hof to those the user specified
if ($user_specified_comics || $user_unspecified_comics) {
    my @new_lof = ();
    my %new_hof = ();
    my @bad_lof = ();
    my @hof_keys = keys(%hof);
    my $fun;
    foreach $fun (@userlof) {
	my @hres = grep(/^$fun$/,@hof_keys);
	if (@hres > 0) {
	    $new_hof{$fun} = $hof{$fun};
	} else {
	    my @lres = grep(/^$fun$/,@lof);
	    if (@lres > 0) {
		push(@new_lof,$fun);
	    } else {
		push(@bad_lof,$fun);
	    }
	}
    }
    if (@bad_lof > 0) {
	print STDERR "No such comics: \"@bad_lof\". ";
	print STDERR "Use -l to see the list of comics.\n";
	exit 1;
    }
    if ($user_specified_comics) {
	@lof = @new_lof;
	%hof = %new_hof;
    } else {
	#set difference: drop the excluded comics from @lof
	my %excluded = map { ($_ => 1) } @new_lof;
	@lof = grep { !$excluded{$_} } @lof;
	foreach (keys %new_hof) {
	    delete $hof{$_};
	}
    }
}

#Make sure the temp dir exists
unless (-d $tmpdir) {
    mkdir($tmpdir,0777) || die "could not create $tmpdir: $!"; 
} elsif ($delete_files) {
    chdir($tmpdir) || die "could not cd to $tmpdir: $!";
    unlink <*.*>;
}


#Do the work.
my $get_current = build_date_array();
if ($extra_verbose) {
    print "dates: ";
    my $date;
    foreach $date (@dates) {
	print strftime("%m-%d-%y",localtime($date));
	print " ";
    }
    print "\n";
}
build_rli_array($get_current);
my @comics=();

push(@comics,get_comics());

if ($remake_webpage) {
    print "Reading $tmpdir to get list of current comics\n" if $extra_verbose;
    opendir(DIR,$tmpdir) || die "could not open $tmpdir: $!";
    my @files = readdir(DIR);
    closedir(DIR);
    @comics=grep(/(gif|je?pg|tiff?|png)$/,@files);
}

create_webpage(@comics) if $make_webpage;


#Build the array of dates of comics to get
sub build_date_array {
    $days_of_comics = undef 
	if defined($days_of_comics) && $days_of_comics == 0;
    if (defined($days_of_comics)) {
	#in case user specified it with a minus sign.
	$days_of_comics = abs($days_of_comics);
	$days_of_comics--; #adjust for 0-base
    }

    my $get_current = 0; #use hof & lof or just hof
    #Determine the start & end dates
    if (! defined($end_date) && ! defined($days_of_comics) &&
	defined($start_date)) {
	#-S, no -E, no -n
	$end_date = time;
    } elsif (defined($end_date) && defined($days_of_comics) &&
	     ! defined($start_date)) {
	#no -S, -E, -n
	$start_date = $end_date - ($days_of_comics * 24*3600);
    } elsif (! defined($end_date) && defined($days_of_comics) &&
	     defined($start_date)) {
	#-S, no -E, -n
	$end_date = $start_date + ($days_of_comics * 24*3600);
    } elsif (! defined($end_date) && defined($days_of_comics) && 
	     ! defined($start_date)) {
	#no -S, no -E, -n
	$get_current = 1;
	$end_date = time - ($days_prior * 24*3600);
	$start_date = $end_date - ($days_of_comics * 24*3600);
    } elsif (! defined($end_date) && ! defined($days_of_comics) && 
	     ! defined($start_date)) {
	#no -S, no -E, no -n
	#we don't need to do anything special
	#add today's date to the date array, and return.
	if (@dates == 0) {
	    push(@dates,time);
	    $get_current = 1;
	}
	return $get_current;
    }

    {   #Build up the date array
	my $time_c = $start_date;
	my @e_day = localtime($end_date);
	my @c_day = localtime($time_c);
	my $e_day = strftime("%Y%m%d",@e_day);
	my $c_day = strftime("%Y%m%d",@c_day);
	while ($c_day <= $e_day) {
	    push(@dates,$time_c);
	    $time_c += 24*3600;
	    @c_day = localtime($time_c);
	    $c_day = strftime("%Y%m%d",@c_day);
	}
    }
    return $get_current;
}

#Build up the list of resource locators
sub build_rli_array {
    my $get_current = shift; #accommodate the time for the rli hash function?
    my $i=0;
    my $time;
    if ($get_current) {
	print "\nAdding lof & get_current RLI's: " if $extra_verbose;
	my $fun;
	foreach $fun (@lof) {
	    print "$fun " if $extra_verbose;
	    foreach $time (@dates) {
		#get the info from the RLI
		$rli[$i] = eval "$fun $time";
		#remember which RLI that was
		$rli[$i++]{'proc'} = $fun if defined $rli[$i];
	    }
	}
	print "\nAdding hof & get_current RLI's: " if $extra_verbose;
	#accommodate the time for the rli hash functions
	my $days;
	while (($fun,$days) = each %hof) {
	    print "$fun " if $extra_verbose;
	    foreach $time (@dates) {
		my $adjtime = $time - ($days * 24*3600);
		#get the info from the RLI
		$rli[$i] = eval "$fun $adjtime";
		#remember which RLI that was
		$rli[$i++]{'proc'} = $fun if defined $rli[$i];
	    }
	}
    } else {
	print "\nAdding hof & !get_current RLI's: " if $extra_verbose;
	my $fun;
	foreach $fun (keys %hof) {
	    print "$fun " if $extra_verbose;
	    foreach $time (@dates) {
		#get the info from the RLI
		$rli[$i] = eval "$fun $time";
		#remember which RLI that was
		$rli[$i++]{'proc'} = $fun if defined $rli[$i];
	    }
	}
    }
    print "\n" if $extra_verbose;
}


#get the comics
sub get_comics {
    #Go through the list of RL's and get the comic at each one.
    my $ua = LWP::UserAgent->new;
    if (defined $proxy_url) {
	print "using proxy, $proxy_url ...\n" if $verbose;
	$ua->proxy(['http', 'ftp'], $proxy_url);
    }
    my $response = undef;
    my @images = (); #list of comics successfully downloaded
    my @bad_images = (); #list of comic ids that had problems
    my $i;
    RLI: for $i (0 .. $#rli) {
	next if ! defined $rli[$i];
	my $rli = $rli[$i]; #keep from having to use [$i] all the time
	my $proc = $rli->{'proc'};
	my $name = $rli->{'name'};
	my $base = $rli->{'base'};
	my ($page,$expr,$exprs,$func) = (undef,undef,undef,undef);
	$page = $rli->{'page'} if defined $rli->{'page'};
	$expr = $rli->{'expr'} if defined $rli->{'expr'};
	$exprs = $rli->{'exprs'} if defined $rli->{'exprs'};
	$func = $rli->{'func'} if defined $rli->{'func'};

	my $file = "$tmpdir/$name";
	print "$name\n" if $verbose && ! $extra_verbose; 
	
        #handle backwards compatibility
	if (defined($exprs) && defined($expr)) {
	    print STDERR "Both exprs & expr are defined in the rli returned";
	    print STDERR " by $proc.  Please use only one. Skipping\n";
	    next;
	}
	if (defined($page)) {
	    if (defined($expr)) {
		#build up the exprs array
		$exprs = [$expr];
	    }
	} elsif (defined($expr)) {
	    #set page to $expr
	    $page = $expr;
	} elsif (defined($exprs)) {
	    $page = "";
	} elsif (defined($func)) {
	    #good.
	} else {
	    print STDERR "None of func, exprs, page, or expr is defined in the ";
	    print STDERR "rli returned by $proc. Please use at least one of them.";
	    print STDERR " Skipping\n";
	    next;
	}

	my $i = 0; #number of the URL gotten
	if (defined($page)) {
	    my $url = "$base$page";
	    print "$name($i): $url\n" if $extra_verbose;
	    $response = $ua->request(GET $url);
	    unless ($response->is_success) {
		print STDERR "failure fetching '$url' for $name ($i).\n";
		push(@bad_images,$proc) unless grep {/^$proc$/} @bad_images;
		next;
	    }
	    $i++;

	    my $exp = "";
	    foreach $exp (@$exprs) {
		#get the location of the image in the html page just returned.
		$_ = $response->content;
		#match on the content as if it were a single line (/$exp/s)
		unless (/$exp/s) {
		    print STDERR "failed to match against '$exp' ($i) for ";
		    print STDERR "$name in $url.\n";
		    push(@bad_images,$proc) 
			unless grep {/^$proc$/} @bad_images;
		    next RLI;
		}
		$url = "$base$1";
		
		#get the next URL
		print "$name($i): expr = '$exp'; URL = $url\n" 
		    if $extra_verbose;
		$response = $ua->request(GET $url);
		unless ($response->is_success) {
		    print STDERR "failure fetching '$url' for $name ($i).\n";
		    push(@bad_images,$proc) 
			unless grep {/^$proc$/} @bad_images;
		    next RLI;
		}
		$i++; #simply keep track for debugging purposes
	    }
	}

	#handle function returning relative URLs
	if ($func) {
	    print "$name: applying function\n" if $extra_verbose;

	    #run the function, giving it the last response downloaded if any
	    my @relurls;
	    if (defined($page)) {
		@relurls = &$func($response->content);
	    } else {
		@relurls = &$func();
	    }		
	    
	    print "$name($i): function returned no relative urls.\n" 
		if @relurls == 0 && $extra_verbose;

	    my $j = 0;
	    foreach (@relurls) {
		my $url = "$base$_";
		$j++; #used to append to the file name
		
		my $mname = $name;
		$mname =~ s/^(.*)\.([^\.]*)$/$1-$j.$2/ if @relurls > 1;

		print "$mname($i): $url\n" if $extra_verbose;
		$response = $ua->request(GET $url);
		unless ($response->is_success) {
		    print STDERR "failure fetching '$url' for $mname ($i).\n";
		    push(@bad_images,$proc) 
			unless grep {/^$proc$/} @bad_images;
		    next; #don't save the body of a failed response
		}
		file_write("$tmpdir/$mname", $files_mode, $response->content);
		push @images,$mname;
		$i++; #simply keep track for debugging purposes
	    }
	} else {
	    #save the one image
	    file_write($file, $files_mode, $response->content);
	    push @images,$name;
	}
	
    }
    
    print "\nImages retrieved: @images\n" if $extra_verbose;
    if (@bad_images > 0) {
	print "To try retrieving the images that failed, run this command:\n";
	print "$script_name -c \"@bad_images\"";
	print " -n $days_of_comics" if ++$days_of_comics > 1;
	print " -W" if $make_webpage;
	print $given_options;
	print "\n";
	print "Please, before sending in a bug report on a comic that doesn't";
	print " download,\ntry over a period of several days (or weeks, ";
	print "depending on the problem) to\nsee if it just happened to be ";
	print "that the website maintainer for that comic\n";
	print "didn't update the comic promptly.\n";
    }
    return @images;
}

sub create_webpage {
    my $comics = shift(@_);

    print "Deleting old webpage(s)\n" if $extra_verbose;
    chdir $tmpdir;
    unlink <index.html>;
    unlink <comic*.html>;

    print "Creating the webpage(s)\n" if $extra_verbose;
    my $fname_tmpl = "comics<NUM>.html";
    my $head_tmpl=file_read("$html_tmpl_dir/head.html");
    my $body_el_tmpl=file_read("$html_tmpl_dir/body_el.html");
    my $links_tmpl=file_read("$html_tmpl_dir/links.html");
    my $tail_tmpl=file_read("$html_tmpl_dir/tail.html");

    my $time = time();
    my $datestr = strftime("%b %d, %Y",localtime($time));
    my $ctime = ctime($time);
    $head_tmpl =~ s/<PAGETITLE>/$webpage_title/g;
    $head_tmpl =~ s/<DATE>/$datestr/g;
    while ($head_tmpl =~ /<DATE FORMAT="([^"]*)">/) {
	my $datestr = strftime($1,localtime($time)); 
	$head_tmpl =~ s/<DATE FORMAT="([^"]*)">/$datestr/;
    }
    $tail_tmpl =~ s/<CTIME>/$ctime/g;

    my @sorted_comics = sort(@comics);
    $comics_per_page = @comics unless defined($comics_per_page);
    $comics_per_page = 1 if $comics_per_page < 1; #avoid dividing by zero
    my $num_groups = int(@comics / $comics_per_page);
    my $comics_on_last = @comics % $comics_per_page;
    $num_groups++ if $comics_on_last > 0;
    print "number of groups    = $num_groups\n" .
	"comics per page     = $comics_per_page\n" .
	"comics on last page = $comics_on_last\n"
	if $extra_verbose;
    my $i = -1;
    while (++$i < $num_groups) {
	my $group = $i + 1;
	my $first = $i * $comics_per_page + 1;
	my $last  = $first + $comics_per_page - 1;
	$last = $first + $comics_on_last -1 
	    if ($group == $num_groups && $comics_on_last > 0);
	my $filename = "$tmpdir/$fname_tmpl";
	my $prevfile = $fname_tmpl;
	my $nextfile = $fname_tmpl;
	my $prevgroup = $group -1;
	my $nextgroup = $group +1;
	$nextfile =~ s/<NUM>/$nextgroup/g;
	if ($group == 1) {
	    $filename = "$tmpdir/$webpage";
	    $prevfile =~ s/<NUM>/$num_groups/g;
	    $prevgroup = $num_groups;
	} else {
	    $filename =~ s/<NUM>/$group/g;
	    $prevfile =~ s/<NUM>/$prevgroup/g;
	    if ($group == $num_groups) {
		$nextfile = $webpage;
	    }
	}
	$prevfile = $webpage if $group == 2;

	print "\nCreating $filename ($first to $last)\n" if $extra_verbose;

	my $head = $head_tmpl;
	my $links = "";
	my $body ="";
	my $tail = $tail_tmpl;
	$head =~ s/<NUM=FIRST>/$first/g;
	$head =~ s/<NUM=LAST>/$last/g;
	if ($num_groups > 1) {
	    $links = $links_tmpl;
	    $links =~ s/<FILE=PREV>/$prevfile/g;
	    $links =~ s/<FILE=NEXT>/$nextfile/g;
	    $links =~ s/<NUM>/$comics_per_page/g;
	}
	my $comic;
	foreach $comic (@sorted_comics[($first-1)..($last-1)]) {
	    my $size = html_imgsize("$tmpdir/$comic");
	    unless (defined($size)) {
		if ($skip_bad_comics) {
		    next;
		} else {
		    $size = "";
		}
	    }
	    my $body_el = $body_el_tmpl;
	    my $name = name_from_filename($comic);
	    print "$name " if $extra_verbose;
	    $body_el =~ s/<COMIC_FILE>/$comic/g;
	    $body_el =~ s/<COMIC_NAME>/$name/g;
	    $body_el =~ s/<SIZE>/$size/g;
	    $body .= $body_el;
	}
	print "\n" if $extra_verbose;
	file_write($filename,$files_mode,"$head$links$body$links$tail");
    }

}

sub syscmd {
    my $cmd = shift(@_);
    print "$cmd\n";
    return system($cmd);
}

sub file_write {
    my $file = shift(@_);
    my $mode = shift(@_);
    my $exists = -f $file;
    open(F, ">$file") || die "Could not open \"$file\". $!";
    binmode(F);
    print(F @_);
    close(F);
    unless ($exists) {
	chmod($mode, $file) || die "Could not set \"$file\"'s permissions. $!";
    }
}

sub file_read {
    local $/ = undef; #input record separator: slurp the whole file
    my $file = shift;
    open(F, "<$file")       || die "Could not open \"$file\". $!\n";
    binmode(F);
    my $text = <F>;
    close(F);
    return $text;
}

#args: comic_filename & optionally boolean to tell it to produce date or not
sub name_from_filename {
    my $name = shift;
    my $do_date = (@_ > 0) ? shift : 1;
    $name =~ s/(_)/ /g;
    if ($name =~ /(.*)-(\d?\d?\d\d)(\d\d)(\d\d)(-\d+)?\.\w+/) {
	$name = "$1";
	$name .= ", $5" if defined($5);
	if ($do_date) {
	    my ($year,$month,$day) = ($2,$3,$4);
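	    #mktime wants the number of years since 1900
	    if (length($year) == 4) {
		$year -= 1900;
	    } elsif ($year < 70) {
		$year += 100;
	    }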
	    $month--;
	    my @ltime = localtime(mktime(0,0,0,$day,$month,$year));
	    my $mday = $ltime[3];
	    $name .= strftime(" (%a, %h $mday, %Y)",@ltime);
	}
    }
    return $name;
}

sub list_comics {
    my %hof = @_;
    my $today = time;
    my %names; #indexed by the comic name (not function) to make sorting easy
    my ($max_flen,$max_nlen) = (0,0);
    my ($f,$d);
    while (($f,$d) = each %hof) {
	my $i = -1;
	my $rh = undef;
	while ($i++ <= 21) {
	    #try successively older dates until the rli function returns a name
	    my $time = $today - (($d + $i) * 24*3600);
	    $rh = eval "$f $time";
	    last if defined($rh);
	}
	my $name = $f;
	$name = name_from_filename($rh->{'name'},0) if (defined($rh));
	$names{$name} = [$f,$d];
	my $len = length($f);
	$max_flen = $len +2 if $len > $max_flen;
	$len = length($name);
	$max_nlen = $len +1 if $len > $max_nlen;
    }
    my @names = sort(keys %names);
    my $title = 'Comic Name';
    my $lines = '----------';
    $names{$title} = ["id","days behind"];
    $names{$lines} = ["--","-----------"];

    print "<HTML>\n<HEAD><TITLE>Supported Comics</TITLE>\n</HEAD>\n<BODY>\n<TABLE>\n" 
	if $make_webpage;

    my $name;
    foreach $name ($title,$lines,@names) {
	my ($f,$d) = @{$names{$name}};
	print "<TR><TD>" if $make_webpage;
	print "$f";
	my $len = $max_flen - length($f);
	if ($make_webpage) {
	    print "</TD><TD>";
	} else {
	    print " " x $len;
	}
	print "$name";
	$len = $max_nlen - length($name);
	if ($make_webpage) {
	    print "</TD><TD>";
	} else {
	    print " " x $len;
	}
	print "$d\n";
	print "</TD></TR>\n" if $make_webpage;
    }
    print "</TABLE>\n</BODY>\n</HTML>\n" if $make_webpage;
    exit 0;
}

sub load_modules {
    my @libdirs = @_;
    push(@INC,@libdirs);
    my $libdir;
    foreach $libdir (@libdirs) {
	if (opendir(DIR, $libdir)) {
	    my @modules = grep { /[^~]$/ && -f "$libdir/$_" } readdir(DIR);
	    closedir DIR;
	    if ($extra_verbose) {
		print "Loading modules in $libdir: ";
	    }
	    my $module;
	    foreach $module (@modules) {
		print "$module " if $extra_verbose;
		require $module if (-r "$libdir/$module");
	    }
	    if ($extra_verbose) {
		print "\n";
	    }
	} else {
	    print STDERR "$libdir is not accessible. Skipping it.\n";
	}
    }
}

sub usage
{
    print "$script_name : download comics from the Web.\n";
    print << "END";
(c)1999 Ben Hochstedler <hochstrb\@cs.rose-hulman.edu>
usage: netcomics [-Dhlvvs] [-p proxy] [-S,-T,-E date] [-n,-N days] [-W,-w[=n]]
                 [-d,-m,-t dir] [-c,-C "comic ids"] [-f date_fmt]
   -c: get the listed comics (ids are separated by white spaces)
   -C: don't get the listed comics (ids are separated by white spaces)
   -d: put comics into directory (default /var/spool/$script_name)
   -D: delete files in directory before retrieving
   -E: specify the ending date of a range of days of comics to retrieve
   -f: specify the date format used when naming files. default: '%y%m%d'
   -h: show usage (doesn't download comics)
   -l: list supported comics & their identifiers (doesn't download comics)
   -m: add dir to locations of comic modules (default /usr/lib/$script_name)
   -n: retrieve this number of days of comics, going backwards. default is 1
   -N: retrieve this number of days prior to the currently available date
   -p: use the given url as the proxy
   -s: don't skip bad comics when creating the webpage
   -S: specify a starting date of a range of days of comics to retrieve
   -t: location of html tmpl files (default /usr/lib/$script_name/html_tmpl)
   -T: specify a specific date of comics to retrieve
   -v: be a little verbose
   -vv:be extra verbose
   -w: create an html file, index.html, n comics per page
   -wt:specify a title for the webpage rather than the default
   -W: recreate webpage from all files in directory, n comics per page
END
    exit 0;
}

