DOS_PERL.TXT      John Dallman, jgd@cix.compulink.co.uk
v2.00 - 27th May 1996

This document provides some general notes on programming with Perl
under MS-DOS. They're based on my experience with various ports of
Perl 3 and 4.

dds   Didi Spenellis' 16-bit Perl 3.041
eva   Eelco Van Asparen's 16-bit Perl 4.036
big   Daryl Orzra's 32-bit Perl 4.036 (aka BigPerl) M03
      This (or a later version) is the one to get.

Disclaimers
===========

I am not the author of MS-DOS Perl. Someone, somewhere has been
telling people that I am. This is false.

I do not now, nor have I ever, programmed for CGI in Perl. I do not
know how to do it, and I am in no hurry to learn. If you want to set
up a money-making web site on your home PC, I am not the man to help
you. Buy a decent book instead: your local bookstore should have
plenty.

Nor do I know very much about Windows 95. The Windows NT version of
Perl runs under Windows 95, and seems to be the generally used version
for that platform. It will *not* run under Win32S on Windows 3.x.

The World of Perl
=================

Perl doesn't have a built-in editor or a visual development
environment. If you want one, you can always write it. You'll develop
Perl at the DOS prompt, using a text editor and lots of
"perl myprog.pl" commands.

To learn Perl, you'll need a Perl system and a book. Books are listed
below; copies of Perl are available on the Internet (see below), or
from shareware suppliers.

Usenet access is very helpful if you use Perl: the Usenet newsgroup
comp.lang.perl.misc is the world's leading source of help and advice
on Perl. Please *try* to find answers to questions in this document,
the comp.lang.perl.misc Frequently Asked Questions (FAQ) files or a
book before asking questions - Dallman's first law of Usenet* applies
strongly on comp.lang.perl.misc.

Perl has lots of uses apart from fancy WWW pages.
The Perl gurus on comp.lang.perl.misc have grown impatient, to put it
mildly, with people who think that the group exists to tell them how
to set up their WWW site. Asking people to write the scripts for your
web site for you does not work: you need to find someone who sells
them, or learn to write your own. The newsgroup will help you with
specific questions about the programming language, but won't do the
work for you. The correct newsgroup for questions about CGI is
comp.infosystems.www.authoring.cgi.

[*Well, since you ask: 'Asking questions you don't understand
produces answers you can't understand']

The Standard Questions:
=======================

1) As of May 1996, there isn't a version of Perl for MS-Windows 3.x.
   BigPerl can run in a Windows DOS box, but has no 'visual' user
   interface under these conditions.

2) Perl version 5 is not available for DOS: the author appears to
   have abandoned the project.

3) There is no undump for MS-DOS, and can't be. See below for more
   details.

Books About Perl
================

Learning Perl
by Randal L. Schwartz
aka "The Llama Book" (engraving of a llama on cover)
published by O'Reilly & Associates
ISBN 1-56592-042-2
Tutorial-style introduction to Perl.

Programming Perl
by Larry Wall and Randal L. Schwartz
aka "The Camel Book" (engraving of a camel on cover)
published by O'Reilly & Associates
ISBN 0-937175-61-1
Tutorial, reference, cookbook, examples, ...

There are several others, but these seem to be the good ones. An
update for the Camel book to cover Perl 5 is due in late summer 1996.

Perl sources on the Internet
============================

The comp.lang.perl.misc FAQ is not currently up to date. The best
starting point to find stuff via the WWW is mox.perl.com, the WWW site
maintained by Tom Christiansen.

The Perl newsgroups are:

  comp.lang.perl.announce, which only carries announcements, and has
      *no* discussion.
  comp.lang.perl.misc, which has general discussion of the language
      and its applications.
  comp.lang.perl.tk, which deals with the Tk extension to Perl. This
      is used with the X-Windows system under Unix, and is not
      relevant to MS-DOS users.

The newsgroups that deal with WWW and CGI authoring issues are the
comp.infosystems.www set, notably comp.infosystems.www.authoring.cgi
and comp.infosystems.www.authoring.html.

For copies of Perl, libraries to use with it, and so on, there is a
network of FTP archive sites, all of which store the same files
(known as 'mirrors'). The list and details below were lifted from the
comp.lang.perl.tk FAQ file. Perl stuff for MS-DOS will be found in
the directory .../CPAN/ports/msdos.

Updated: Thu May 16 01:13:37 EDT 1996

Africa
    South Africa      ftp://ftp.is.co.za/programming/perl/CPAN/
Asia
    Hong Kong         ftp://ftp.hkstar.com/pub/CPAN/
    Japan             ftp://ftp.lab.kdd.co.jp/lang/perl/CPAN/
    Taiwan            ftp://dongpo.math.ncu.edu.tw/perl/CPAN/
Australasia
    Australia         ftp://coombs.anu.edu.au/pub/perl/CPAN/
                      ftp://ftp.mame.mu.oz.au/pub/perl/CPAN/
    New Zealand       ftp://ftp.tekotago.ac.nz/pub/perl/CPAN/
Europe
    Austria           ftp://ftp.tuwien.ac.at/pub/languages/perl/CPAN/
    Czech Republic    ftp://sunsite.mff.cuni.cz/MIRRORS/ftp.funet.fi/pub/languages/perl/CPAN/
    Denmark           ftp://sunsite.auc.dk/pub/languages/perl/CPAN/
    Finland           ftp://ftp.funet.fi/pub/languages/perl/CPAN/
    France            ftp://ftp.ibp.fr/pub/perl/CPAN/
                      ftp://ftp.pasteur.fr/pub/computing/unix/perl/CPAN/
    Germany           ftp://ftp.leo.org/pub/comp/programming/languages/perl/CPAN/
                      ftp://ftp.rz.ruhr-uni-bochum.de/pub/CPAN/
    Greece            ftp://ftp.ntua.gr/pub/lang/perl/
    Hungary           ftp://ftp.kfki.hu/pub/packages/perl/CPAN/
    the Netherlands   ftp://ftp.cs.ruu.nl/pub/PERL/CPAN/
    Poland            ftp://ftp.pk.edu.pl/pub/lang/perl/CPAN/
                      ftp://sunsite.icm.edu.pl/pub/CPAN/
    Portugal          ftp://ftp.ci.uminho.pt/pub/lang/perl/
    Slovenia          ftp://ftp.arnes.si/software/perl/CPAN/
    Spain             ftp://ftp.rediris.es/mirror/CPAN/
    Sweden            ftp://ftp.sunet.se/pub/lang/perl/CPAN/
    Switzerland       ftp://ftp.switch.ch/mirror/CPAN/
    UK                ftp://ftp.demon.co.uk/pub/mirrors/perl/CPAN/
                      ftp://sunsite.doc.ic.ac.uk/packages/CPAN/
                      ftp://unix.hensa.ac.uk/mirrors/perl-CPAN/
North America
    British Columbia  ftp://mango.pinc.com/pub/mirrors/CPAN/
    California        ftp://ftp.digital.com/pub/plan/perl/CPAN/
                      ftp://ftp.cdrom.com/pub/perl/CPAN/
    Colorado          ftp://ftp.cs.colorado.edu/pub/perl/CPAN/
    Florida           ftp://ftp.cis.ufl.edu/pub/perl/CPAN/
    Illinois          ftp://uiarchive.cso.uiuc.edu/pub/lang/perl/CPAN/
    Massachusetts     ftp://ftp.iguide.com/pub/mirrors/packages/perl/CPAN/
    Oklahoma          ftp://ftp.uoknor.edu/mirrors/CPAN/
    Texas             ftp://ftp.sedl.org/pub/mirrors/CPAN/
                      ftp://ftp.metronet.com/pub/perl/
                      ftp://ftp.sterling.com/CPAN/
South America
    Chile             ftp://sunsite.dcc.uchile.cl/pub/Lang/perl/CPAN/

If you have a multi-protocol browser, you might pay a visit to Tom
Christiansen's CPAN multiplexer, whose relevant Tk URLs are:

    http://perl.com/cgi-bin/cpan_mod?module=Tk
    http://perl.com/cgi-bin/cpan_mod?module=Tk&readme=1

According to Stephen P. Potter, some of the CPAN sites have
decompression on the fly for people who do not have programs like
gunzip. For example, at the ufl site (Florida USA) type this into your
ftp session to download a gunzipped version of Tk:

    ftp> get Tk-b11.02.tar.gz Tk-b11.02.tar

If you have the appropriate CPAN and FTP modules already installed,
you can retrieve a module from CPAN and carry out a complete
installation with a perl one-liner like this:

    perl -MCPAN -e 'install "Tk"'

For more information on CPAN you can send e-mail to the CPAN
administrators. If you know of some Perl resources that seem not to
be in the CPAN (you did check the contents listings in indices/,
didn't you?) please tell the CPAN administrators. If you have some
modules/scripts/documentation yourself that you would like to
contribute to CPAN, please read the file authors/00upload.howto and
let the CPAN administrators know about it.

About Perl
==========

The Perl programming language resembles 'C', with many added features
from programs provided with the UNIX operating system.
Perl programs ('scripts') aren't compiled to COM or EXE files, but
are run by the program PERL.EXE. The process is similar to running a
DOS batch file. However, PERL.EXE reads and checks an entire script
before starting to run it, which makes the procedure faster and more
reliable.

A copy of Perl consists, at a minimum, of PERL.EXE and a text file
containing the GNU licence agreement. Perl is 'copyleft' free
software, and you should read and understand the licence agreement.

Perl systems usually contain documentation, sample programs and
additional programs: there is often a PERLGLOB.EXE program which
expands wildcard filenames for Perl. The standard Perl library of
scripts containing subroutines is often supplied, but often isn't
much use under DOS: many of its components rely on UNIX features.

Perl scripts are simple text files and can be written with any normal
text editor. Under DOS, their filenames normally have an extension of
.PL, but this is a convention, not a standard.

The differences between UNIX Perl and DOS Perl are important. The
basic programming language is the same, but many functions used for
UNIX system management, or intended for operation in a multitasking
environment, aren't implemented. See "DOS Perl", below, for details.

Memory
======

16-bit Perls run as conventional MS-DOS programs within the normal
640Kb of memory managed by MS-DOS. As PERL.EXE is at least 300Kb,
there isn't much memory available for Perl program variables.

BigPerl runs as a 32-bit DOS-extended program and obviates this
problem; it can use all available memory (it needs at least a 386
with 4Mb of RAM), plus virtual memory if required. It's also much
faster than a 16-bit Perl, and is the only "real" implementation for
modern PCs.

Installation
============

It's conventional practice to install Perl and its support files in
their own directory (e.g., C:\PERL). Add this directory to the PATH
statement in your AUTOEXEC.BAT file.
Environment variables are commonly used to configure an MS-DOS Perl:

- PERL:     Holds the pathname of the directory where the Perl
            executable is installed. Not used by PERL.EXE, but useful
            for MS-DOS setup.
            Example: set perl=c:\perl

- PERLLIB:  Holds a list of directories which PERL.EXE will search
            for library files (it adds them to the Perl array
            variable @INC, which is a list of directories to search).
            Example: set perllib=c:\perl;c:\perl\lib;c:\myscript

If you use #!PERL (see below) to start Perl scripts under DOS, it can
use further environment variables, described in its own
documentation.

Developing with Perl
====================

The write-code/test-code cycle is very short when working with Perl;
when I'm tinkering with a short but troublesome program, the cycle is
often under a minute. If you don't normally use a DOS command line
editing system (e.g., the DOSKEY program supplied with DOS 5.0
onwards), try it for Perl work.

Octal numbers
=============

By convention, UNIX usually uses octal, rather than hexadecimal,
numbers. Perl is supposed to support both, but it's easier to work
with Perl if you learn to think in octal.

DOS Perl
========

This section covers the special limitations and changes that DOS
imposes on Perl.

#!/usr/bin/perl
---------------

The Perl documentation makes a great fuss about this piece of UNIX
shell syntax for running Perl programs. This doesn't work under DOS;
several tricks using batch files have been produced, but I don't find
any of them satisfactory. I've created a utility program, #!PERL.EXE,
which implements #! functionality via a simple trick. It is available
from .../CPAN/ports/msdos/tips-tricks as hbp_40.zip (later versions
will be hbp50.zip, hbp60.zip, etc).
The alternative is simply to run Perl from the DOS command line and
tell it the name of the script it's to run, thus:

    perl myprog.pl "my parameter"

Wall & Schwartz, page 12
------------------------

The first substantial example program in Wall & Schwartz won't work
under DOS Perl, as it relies on the UNIX program FIND. A modified
version, which uses the enhanced DIR command of DOS 5.0 onwards as a
substitute, is at the end of this file. Some comments have been added
and the script has been modified to a 'C'-style layout for clarity.

In general, Perl scripts that use external UNIX commands won't work
on DOS. The MKS Toolkit is a package of UNIX utilities ported to DOS
that some people find useful. Personally, I'm used to DOS and prefer
not to modify it too much.

Regular expressions
-------------------

If you have trouble with the concept or syntax of "regular
expressions", consult a book on UNIX, specifically the UNIX
text-processing utilities. Unfortunately, regular expressions are so
common (and so standard) in work with UNIX that many books assume a
basic understanding of them.

One very complete explanation is given in Sed and Awk, by Dale
Dougherty, published by O'Reilly & Associates in 1990, ISBN
0-937175-59-5. This describes the Sed and Awk programs, ancestors of
Perl, and has a very complete introduction to regular expressions.

Handling \n and \r\n
--------------------

By convention, UNIX programs use a 'new line' character to mark the
end of a line of text. This character is written '\n'; on most UNIX
machines, it is the ASCII line feed character (LF, 0x0A).

DOS normally uses two characters to mark the end of a line: Carriage
Return and Line Feed (CR & LF, 0x0D, 0x0A). In Perl, as with most
C-based programs, these are written '\r' and '\n'.

As the inner workings of Perl often assume that "end-of-line" is one
character, a DOS Perl treats '\r\n' as end-of-line on input, and
"eats" the '\r' before handing the line over to the Perl script.
It then replaces '\n' with '\r\n' in its output. If you want to
handle '\r' yourself, you can turn off Perl's handling of it with the
binmode() function.

Unimplemented words
-------------------

The following Perl words and concepts are described in Wall &
Schwartz, but aren't available in (some versions of) DOS Perl.

Word            Reason(s)
----            ---------
Accept          UNIX IPC function.
Alarm           UNIX IPC, not in Perl 3.0.
Bind            UNIX IPC, not in Perl 3.0.
Caller          Not in Perl 3.0.
Chown           UNIX permission system.
Chroot          UNIX superuser function.
Connect         UNIX IPC.
Crypt           "Unimplemented due to excessive paranoia".
Dbm...          Only available in BigPerl.
Exec            Doesn't work in BigPerl, since the DOS extender
                doesn't support it. In 16-bit Perls, it terminates
                Perl and runs the command specified. Remember to
                distinguish it from eval(), which runs a Perl script,
                inside Perl.
Fcntl           UNIX facility, not available on DOS.
Flock           UNIX facility, not available on DOS.
Fork            UNIX facility, not available on DOS.
Get...          UNIX facilities, not available on DOS, many not in
(except getc)   Perl 3.0.
Kill            UNIX IPC.
Link            UNIX facility, not available on DOS.
Listen          UNIX IPC.
Msg...          UNIX IPC, not in Perl 3.0.
Ndbm...         Only available in BigPerl.
Pipe            UNIX facility, not available on DOS (but the pipe
                functions of open() are implemented).
Readlink        UNIX facility, not available on DOS.
Recv            UNIX facility, not available on DOS.
Require         Not in Perl 3.0.
Scalar          Not in Perl 3.0.
Select          The Select(RBITS, WBITS, EBITS, TIMEOUT) form of
                select() is a UNIX function, not available on DOS.
Sem...          UNIX facility, not available on DOS.
Send            UNIX facility, not available on DOS.
Set...          UNIX facility, not available on DOS.
Shm...          UNIX facility, not available on DOS.
Shutdown        UNIX facility, not available on DOS.
Socket...       UNIX facility, not available on DOS.
Symlink         UNIX facility, not available on DOS.
Syscall         UNIX facility, not available on DOS.
Sysread         Not in Perl 3.0.
Syswrite        Not in Perl 3.0.
Truncate        Not in Perl 3.0.
Umask           UNIX facility, not available on DOS.
Wait...         UNIX facility, not available on DOS.

Command line arguments
----------------------

-D   This isn't implemented in any DOS Perl that I've seen; debugging
     support code would increase the size of PERL.EXE considerably.

-P   Don't expect this to work: most DOS machines don't have a C
     compiler installed, and many DOS C compilers don't allow you to
     use the preprocessor without compiling the file as a C program.

-u   This isn't implemented: see Dump(), below.

Different words
===============

Some Perl commands operate differently on DOS. This section gives
some of the important differences.

If you specify pathnames with '\', Perl treats the backslash as a
special character, just like C. Put pathnames in single quotes, or
use '\\', or '/', which Perl file operations can use as a pathname
element separator.

Chdir() has been bodged in eva and BigPerl to let it change the
current disk drive as well as the directory.

Chmod() can only be used to set or clear the read-only attribute of
DOS files. 'File attributes', below, gives details.

Delete() does *not* remove files (use unlink()); it removes elements
from arrays.

Dump() doesn't work: it produces an "Abnormal program termination"
message. DOS is one of the systems where dump() (and the -u command
line argument) can't be implemented.

Localtime() gives a time() value (see below) based on the current DOS
clock time. Gmtime() has a fixed offset from localtime(); since DOS
doesn't support the UNIX time zone system, the offset depends on the
assumptions compiled into your copy of Perl.

Ioctl() was partially implemented in the dds and eva implementations,
but not in BigPerl. The MS-DOS IOCTL mechanism isn't nearly as general
or useful as the UNIX implementation and isn't used for much.

Lstat() operates as stat(); DOS doesn't have symbolic links.

mkdir(): Ignore the note about subprocesses and mkdir(); it isn't an
issue under DOS.
Use a MODE value of 0; under UNIX, MODE can be used to create private
subdirectories (see 'File attributes'), but this isn't possible under
DOS, and 0 is handled as a special case.

Rename() can rename files or directories, and can move them to
different directories. However, it can't move a file or directory to
a different DOS drive letter, even if they're on the same physical
disk drive.

Stat() returns 11 values. Using the notation of page 188 of Wall &
Schwartz, these are:

$dev      Drive number: 0 for A:, 2 for C:, etc.
$ino      Always zero: the "inode number" under UNIX.
$mode     The file's attributes, in an emulation of the UNIX format.
          See 'File attributes' for details.
$nlink    Always 1: the number of links to the file under UNIX.
$uid      Always 0: the user ID of the file under UNIX.
$gid      Always 0: the group ID of the file under UNIX.
$rdev     Always equal to $dev: under UNIX, the 'real' device holding
          the file.
$size     File size in bytes.
$atime    Date/time stamp of file, seconds since 1970 (see time()).
$mtime    Always equal to $atime
$ctime    Always equal to $atime

System() runs a program 'inside' Perl. Perl remains in memory
(leaving about 200Kb free for the system()'ed command on a 16-bit
Perl, or 500Kb on BigPerl) and the Perl script continues when the
program terminates.

Time() returns the current time in seconds since 00:00:00, 1st Jan
1970. This is the standard format for date/time values when managing
files under UNIX; the DOS format is different, but Perl converts
them.

Utime() sets the date and time stamps of the files it is given to
process. UNIX has several stamps for each file; even though DOS only
records one, you must give utime() two times (preferably identical
ones). Note that the times are in seconds since 1970, as returned by
time() and stat().

Unlink() deletes files. Delete() removes elements from arrays, not
files from directories.
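The notes on stat(), time(), utime() and unlink() above can be tried
out with a short script. This is only a sketch, written in Perl 4
style; the script name STAMP.PL, the scratch file STAMP.TMP and the
subroutine &stamp_of are made up for the example, and I haven't run
it on every DOS port.

```perl
# STAMP.PL: sketch of stat(), time(), utime() and unlink().
# STAMP.TMP and &stamp_of are hypothetical names for this example.

$file = 'STAMP.TMP' ;

open( TMP, ">$file") || die "Can't create $file: $!\n" ;
print TMP "test\n" ;
close( TMP) ;

# stat() returns its values in the order listed above, so element 9
# of the list is $mtime.
sub stamp_of
{
    local( $name) = @_ ;
    local( @st) = stat( $name) ;
    return $st[9] ;
}

# Set the stamp to exactly one day ago. utime() wants two times, so
# it gets the same one twice.
$then = time() - 24 * 60 * 60 ;
$changed = utime( $then, $then, $file) ;   # number of files changed

$stamp = &stamp_of( $file) ;
print "$file: utime() changed $changed file(s)\n" ;
print "Stamp differs from requested time by ", $stamp - $then,
      " second(s)\n" ;

unlink( $file) ;      # unlink(), not delete(), removes the file
```

DOS directory entries store times to the nearest two seconds, so a
difference of a second between the requested and reported stamp is to
be expected there.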
File attributes
===============

Both DOS and UNIX have the concept of "File attributes": various
properties of a file, stored in the file's directory entry, and
capable of being set On or Off. However, the attributes used by the
two operating systems are very different. Perl, as a UNIX-based
program, retains a UNIX viewpoint even when running under DOS.

DOS has attributes to denote directory entries as Read-Only, Hidden
or "System" files, as being disk volume labels or subdirectories, and
as files which have been changed since they were backed up (the
"Archive" flag).

UNIX has a more complex system. Nine flags exist to control access or
use of the file. These are the "permissions"; three further flags
control the execution of programs held in the file. These flags are
collectively known as the "mode" of the file, which is usually
expressed as an octal number. You can use the Perl function oct() to
convert a character string of octal digits to a numeric value - see
page 69 of Wall & Schwartz.

The UNIX file mode is made up of the bitwise OR of the following
octal values, most of which aren't meaningful under DOS.

0400    Owner can read the file
0200    Owner can write to the file
0100    Owner can execute the file or search the directory
0040    Members of owner's group can read
0020    Members of owner's group can write
0010    Members of owner's group can run/search
0004    Anybody can read.
0002    Anybody can write.
0001    Anybody can run/search
4000    Program runs under its owner's user ID
2000    Program runs under its owner's group ID
1000    Program will be run by many users at once

Chmod attributes
================

DOS Perl uses the owner's write permission bit in a UNIX-style file
mode value to control the DOS Read-Only flag for each file being
processed by chmod(). For example:

    chmod 0755, "test.doc" ;

Under UNIX, this gives the file's owner read, write and execute
permission and all other users read and execute permission.
DOS Perl sees that the "owner" is allowed to write to the file, so it
doesn't flag the file as Read-Only.

    chmod 0555, "test.doc" ;

When handling this command, DOS Perl sees that the user doesn't have
write permission, and sets the file to Read-Only.

As far as I've established, none of the other DOS flags can be set by
chmod(), and none of the other UNIX mode bits are significant.

Stat attributes
===============

The $mode field of stat()'s output produces UNIX-style attributes.
Only a few values can be produced, some of which are non-standard:

Octal value   Meaning
-----------   -------
0020666       Console input or output: default for STDIN, STDOUT and
              STDERR
0100666       Disk file; read and write permitted. Redirected STDIN
              or STDOUT produce this value.
0100444       Read-only disk file.
0100777       COM, EXE or BAT file.

STATTEST.PL, below, may help with exploring stat() results.

Serial Ports
============

In theory, the COM1, COM2, COM3 and COM4 devices provided by DOS can
be opened using open() and data sent to and from them. In practice
this doesn't work, because DOS simply sends I/O calls for the serial
ports to the BIOS routines, and these are very limited. The BIOS
serial port support (INT14h) does not handle interrupts or buffering,
and so is essentially useless for data rates above 1200bps. It also
has rather individual ideas about handshake signals, and, in general,
doesn't work properly. All "real" communications programs for the
IBM-PC and compatibles handle the comms hardware directly.

However, the situation is not hopeless. INT14h is so simple that it
is easy to emulate, and this is how most systems for sharing a modem
across a network operate. There is also a shareware system known as
FOSSIL, which provides decent interrupt-driven buffered
communications drivers for the PC which support the INT14h interface.
I've never tried to use this with Perl - I've never had a project
which required serial port handling - but it looks like the best
option.
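Returning to the chmod() and stat() mode values described under
'Chmod attributes' and 'Stat attributes': the sketch below sets and
clears Read-Only on a scratch file and checks the owner's write bit
(0200) in the mode that stat() reports. It is written in Perl 4
style; the script name RDONLY.PL, the file RDONLY.TMP and the
subroutine &is_read_only are made up for the example, and it assumes
your port's stat() reflects the Read-Only flag as shown in the table
above.

```perl
# RDONLY.PL: sketch of chmod() and stat() mode handling.
# RDONLY.TMP and &is_read_only are hypothetical names.

$file = 'RDONLY.TMP' ;

open( TMP, ">$file") || die "Can't create $file: $!\n" ;
close( TMP) ;

# Returns 1 if the owner's write bit (octal 0200) is clear in the
# file's mode, 0 otherwise.
sub is_read_only
{
    local( $name) = @_ ;
    local( $mode) = (stat( $name)) [2] ;
    return (($mode & 0200) == 0) ? 1 : 0 ;
}

chmod( 0555, $file) ;                  # no write bit: Read-Only
$after_0555 = &is_read_only( $file) ;
printf( "after chmod 0555: mode %o, read-only = %d\n",
        (stat( $file)) [2], $after_0555) ;

chmod( 0755, $file) ;                  # write bit set: writable
$after_0755 = &is_read_only( $file) ;
printf( "after chmod 0755: mode %o, read-only = %d\n",
        (stat( $file)) [2], $after_0755) ;

unlink( $file) ;    # done last, while the file is writable again
```

The mode numbers are written with a leading zero so that Perl reads
them as octal; chmod( 555, $file) would pass a quite different value.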
########################## STATTEST.PL #########################

# STATTEST.PL: Perl script to explore stat() mode values under MS-DOS.

# Record the mode (element 2 of stat()'s result) for the named file.
sub add_stats
{
    $stats{$_[0]} = (stat( $_[0])) [2] ;
}

foreach $arg (@ARGV)
{
    &add_stats( $arg) ;
    print "$arg is executable\n" if -x $arg ;
}

$stats{ STDERR} = (stat( STDERR)) [2] ;
$stats{ STDIN} = (stat( STDIN)) [2] ;
$stats{ STDOUT} = (stat( STDOUT)) [2] ;

# Each mode is printed twice, once with %lo and once with %o.
foreach $dev (sort keys( %stats))
{
    printf( "%s has mode %lo\n", $dev, $stats{$dev}) ;
    printf( "%s has mode %o\n", $dev, $stats{$dev}) ;
}

########################## PAGE_12.PL #########################

# Example from page 12 of Programming Perl, for MS-DOS Perl,
# amplified in places and with some notes

# As there is no FIND, we use DIR. The pipe-handling shown here
# resorts to some outrageous fakery under MS-DOS, but works.
# Note, however, that the error message never gets used:
# if dir fails, it reports its error message to its own stdout
# and an empty file is returned. This needs MS-DOS 5.00 for the
# /b parameter to DIR.

open( FIND, "dir *.* /b |") || die "Couldn't run dir: $!\n" ;

FILE:
while( $filename = <FIND>)
{
    chop( $filename) ;
    print "File $filename... " ;
    if (! -T $filename)
    {
        print "not a text file\n" ;
        next FILE ;
    }
    # We use -T to eliminate subdirectories, EXE files and maybe others?

    if (!open( TEXTFILE, $filename))
    {
        print STDERR "Can't open $filename -- continuing...\n" ;
        next FILE ;
    }
    print "searching... " ;
    while (<TEXTFILE>)
    {
        foreach $word ( @ARGV)
        {
            if (index( $_, $word) >= 0)
            {
                print "found \"", $word, "\" \n" ;
                # We alter this message to be a little clearer
                next FILE ;
                # So we don't find multiple hits on one file
            }
        }
    }
    print "processed \n" ;
}

####################### End of DOS_PERL.TXT ######################