
Chapter 10. Getting a Handle on Files

10.1. The User-Defined Filehandle

If you are processing text, you will regularly be opening, closing, reading from, and writing to files. In Perl, we use filehandles to get access to system files.

A filehandle is a name for a file, device, pipe, or socket. In Chapter 4, "Getting a Handle on Printing," we discussed the three default filehandles, STDIN, STDOUT, and STDERR. Perl allows you to create your own filehandles for input and output operations on files, devices, pipes, or sockets. A filehandle allows you to associate the filehandle name with a system file[1] and to use that filehandle to access the file.

[1] A system file would be a UNIX, Win32, Macintosh file, etc., stored on the system's disk.

10.1.1. Opening Files—The open Function

The open function lets you name a filehandle and the file you want to attach to that handle. The file can be opened for reading, writing, or appending (or both reading and writing), and the file can be opened to pipe data to or from a process. The open function returns a nonzero result if successful and an undefined value if it fails. Like scalars, arrays, and labels, filehandles have their own namespace. So that they will not be confused with reserved words, it is recommended that filehandle names be written in all uppercase letters. (See the open function in Appendix A.)

When opening text files on Win32 platforms, the \r\n (characters for return and newline) are translated into \n when text files are read from disk, and the ^Z character is read as an end-of-file marker (EOF). The following functions for opening files should work fine with text files but will cause a problem with binary files. (See "Win32 Binary Files" on page 292.)

10.1.2. Open for Reading

The following examples illustrate how to open files for reading. Even though the examples represent UNIX files, they will work the same way on Windows, Mac OS, etc.

Format

1   open(FILEHANDLE, "FILENAME");
2   open(FILEHANDLE, "<FILENAME");
3   open(FILEHANDLE);
4   open FILEHANDLE;

Example 10.1.

1   open(MYHANDLE, "myfile");
2   open (FH, "</etc/passwd");
3   open (MYHANDLE);

Explanation

  1. The open function will create the filehandle MYHANDLE and attach it to the system file myfile. The file will be opened for reading. Since a full pathname is not specified for myfile, it must be in the current working directory, and you must have read permission to open it for reading.

  2. The open function will create the filehandle FH and attach it to the system file /etc/passwd. The file will be opened for reading, but this time the < symbol is used to indicate the operation. The < symbol is not necessary but may help clarify that this is a read operation. The full pathname is specified for passwd.

  3. If FILENAME is omitted, the name of the filehandle is the same name as a scalar variable previously defined. The scalar variable has been assigned the name of the real file. In the example, the filename could have been defined as

    $MYHANDLE="myfile";
    open(MYHANDLE);

    The open function will create the filehandle MYHANDLE and attach it to the file named by the value of the variable $MYHANDLE. The variable must be a global (package) variable, not one declared with my. The effect will be the same as the first example. The parentheses are optional.


Closing the Filehandle

The close function closes the file, pipe, socket, or device attached to FILEHANDLE. Once FILEHANDLE is opened, it stays open until the script ends or you call the open function on it again. (The next call to open closes FILEHANDLE before reopening it.) If you reopen the file this way without explicitly closing it first, the line counter variable, $., will not be reset. Closing a pipe causes the script to wait until the process on the pipe has finished and places that process's exit status in the $? variable; if a system call fails, the error is reported in $! (see "The die Function" on page 287). It's a good idea to explicitly close files and handles after you are finished using them.

Format

close (FILEHANDLE);
close FILEHANDLE;

Example 10.2.

1   open(INFILE, "datebook");
    close(INFILE);

Explanation

  1. The user-defined filehandle INFILE will be closed.


The die Function

In the following examples, the die function is used if a call to the open function fails. If Perl cannot open the file, the die function is used to exit the Perl script and print a message to STDERR, usually the screen.

If you were to go to your shell or MS-DOS prompt and type

cat junk (UNIX)

or

type junk (DOS)

and if junk is a nonexistent file, the following system error would appear on your screen:

cat: junk: No such file or directory    (UNIX "cat" command)
The system cannot find the file specified.   (Windows "type" command)

When using the die function, Perl provides a special variable $! to hold the value of the system error (see "Error Handling" on page 755) that occurs when you are unable to successfully open a file or execute a system utility. This is very useful for detecting a problem with the filehandle before continuing with the execution of the script. (See use of Carp.pm discussed in Example 12.10 on page 384.)

Example 10.3.

(Line from Script)
1   open(MYHANDLE, "/etc/password") || die  "Can't open: $!\n";
2   open(MYHANDLE, "/etc/password") or die  "Can't open: $!\n";

(Output)
1   Can't open: No such file or directory
(Line from Script)
3   open(MYHANDLE,   "/etc/password") || die  "Can't open: $!";

(Output)
3   Can't open: No such file or directory at ./handle line 3.

Explanation

  1. When trying to open the file /etc/password, the open fails (it should be /etc/passwd). The short-circuit operator || evaluates its right operand only if the left operand is false, so when open fails, the die function is executed. The string Can't open: is printed, followed by the system error No such file or directory. The \n suppresses any further output from the die function. All of die's output is sent to STDERR, and then the script exits.

  2. For readability, you may want to use the or operator instead of ||. Because or has lower precedence than ||, it also works when the parentheses around open's arguments are omitted.

  3. This is exactly like the first example, except that the \n has been removed from the end of the string. Omitting the \n causes the die function to append the script name and the line number where die was called.
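
The same open-or-die pattern can be written with a lexical filehandle and the three-argument form of open, which newer versions of Perl support. The following is a minimal sketch rather than one of the book's examples; the helper name try_open and the path /no/such/dir/junk are hypothetical, and the path is simply assumed not to exist.

```perl
use strict;
use warnings;

# Hypothetical helper: try to open a file for reading and return
# (handle, undef) on success or (undef, error text) on failure.
sub try_open {
    my ($path) = @_;
    # Three-argument open: the mode ("<") is separate from the filename.
    if (open(my $fh, '<', $path)) {
        return ($fh, undef);
    }
    # $! holds the system error from the failed open.
    return (undef, "Can't open $path: $!");
}

# A path that is assumed not to exist on this system.
my ($fh, $err) = try_open("/no/such/dir/junk");
if (defined $err) {
    print STDERR "$err\n";     # A script would normally call die here
}
else {
    close($fh);
}
```

Returning the error text instead of calling die lets the caller decide whether a missing file is fatal.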

Reading from the Filehandle

In Example 10.4, a file called datebook is opened for reading. Each line read is assigned, in turn, to $_, the default scalar that holds what was just read until the end of file is reached.

Example 10.4.

(The Text File: datebook)
    Steve Blenheim
    Betty Boop
    Lori Gortz
    Sir Lancelot
    Norma Cord
    Jon DeLoach
    Karen Evich

----------------------------------------------------------------

(The Script)
    #!/usr/bin/perl
    # Open a file with a filehandle
1   open(FILE, "datebook") || die "Can't open datebook: $!\n";
2   while(<FILE>) {
3       print if /Sir Lancelot/;
4   }
5   close(FILE);

(Output)
3   Sir Lancelot

Explanation

  1. The open function will create a filehandle called FILE (opened for reading) and attach the system file datebook to it. If open fails because the file datebook does not exist, the die operator will print to the screen, Can't open datebook: No such file or directory.

  2. The expression in the while loop is the filehandle FILE, enclosed in angle brackets. The angle brackets are the operators used for reading input. (They are not part of the filehandle name.) When the loop starts, the first line from the filehandle FILE will be stored in the $_ scalar variable. (Remember, the $_ variable holds each line of input from the file.) If it has not reached end of file, the loop will continue to take a line of input from the file, execute statements 3 and 4, and continue until end of file is reached.

  3. The default input variable $_ is implicitly used to hold the current line of input read from the filehandle. If the line contains the regular expression Sir Lancelot, that line (stored in $_) is printed to STDOUT. For each loop iteration, the next line read is stored in $_ and tested.

  4. The closing curly brace marks the end of the loop body. When this line is reached, control will go back to the top of the loop (line 2) and the next line of input will be read from file; this process will continue until all the lines have been read.

  5. After looping through the file, the file is closed by closing the filehandle.

Example 10.5.

(The Text File: datebook)
    Steve Blenheim
    Betty Boop
    Lori Gortz
    Sir Lancelot
    Norma Cord
    Jon DeLoach
    Karen Evich

----------------------------------------------------------------

(The Script)
    #!/usr/bin/perl
    #Open a file with a filehandle
1   open(FILE, "datebook") || die "Can't open datebook: $!\n";
2   while($line = <FILE>) {
3       print "$line" if $line =~ /^Lori/;
4   }
5   close(FILE);

(Output)
3   Lori Gortz

Explanation

  1. The datebook file is opened for reading.

  2. When the while loop is entered, a line is read from the file and stored in the scalar $line.

  3. The value of the scalar $line is printed if it contains the pattern Lori, and Lori is at the beginning of the line.

  4. When the closing brace is reached, control goes back to line 2, and another line is read from the file. The loop ends when the file has no more lines.

  5. The file is closed by closing the filehandle.

Example 10.6.

(The Text File: datebook)
    Steve Blenheim
    Betty Boop
    Lori Gortz
    Sir Lancelot
    Norma Cord
    Jon DeLoach
    Karen Evich

----------------------------------------------------------------

(The Script)
    #!/usr/bin/perl
    # Open a file with a filehandle
1   open(FILE, "<datebook") || die "Can't open datebook: $!\n";
2   @lines = <FILE>;
3   print @lines;       # Contents of the entire file are printed
4   print "\nThe datebook file contains ", $#lines + 1,
          " lines of text.\n";
5   close(FILE);

(Output)
3   Steve Blenheim
    Betty Boop
    Lori Gortz
    Sir Lancelot
    Norma Cord
    Jon DeLoach
    Karen Evich

4   The datebook file contains 7 lines of text.

Explanation

  1. The datebook file is opened for reading. (The < read operator is not required.)

  2. All of the lines are read from the file, via the filehandle, and assigned to @lines, where each line is an element of the array. The newline terminates each element.

  3. The array @lines is printed.

  4. The value of $#lines is the subscript of the last element in the array. By adding one to $#lines, the number of elements (lines) is printed. A negative subscript counts from the end of the array, so $lines[-1] also refers to the last line.
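
The difference between reading one line at a time and reading the whole file into an array can be sketched as follows. This is an illustration under stated assumptions: it builds its own scratch file with the core File::Temp module instead of using datebook, and the variable names are made up for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file so the sketch is self-contained.
my ($tmp, $filename) = tempfile(UNLINK => 1);
print {$tmp} "Steve Blenheim\nBetty Boop\nLori Gortz\n";
close($tmp) or die "Can't close $filename: $!\n";

# Scalar context: one line per read, until end of file.
open(my $in, '<', $filename) or die "Can't open $filename: $!\n";
my $count = 0;
while (my $line = <$in>) {
    $count++;
}
close($in);

# List context: every line in the file is returned at once.
open(my $again, '<', $filename) or die "Can't open $filename: $!\n";
my @lines = <$again>;
close($again);

print "Read $count lines one at a time and ", scalar(@lines),
      " lines all at once.\n";
```

Reading in list context is convenient for small files; for large files the while loop uses far less memory because only one line is held at a time.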

10.1.3. Open for Writing

When a file is opened for writing, it is created if it does not exist. If it already exists, you must have write permission on it, and its contents will be overwritten. The filehandle is used to access the system file.

Format

1   open(FILEHANDLE, ">FILENAME");

Example 10.7.

1   open(MYOUTPUT, ">temp");

Explanation

  1. The user-defined filehandle MYOUTPUT will be used to send output to the file called temp. The > symbol works like output redirection in the shell: output printed to MYOUTPUT goes to the temp file instead of to the screen.


Example 10.8.

(The Script)
    #!/usr/bin/perl
    # Write to a file with a filehandle. Scriptname: file.handle
1   $file="/home/jody/ellie/perl/newfile";
2   open(HANDOUT, ">$file") || die "Can't open newfile: $!\n";

3   print HANDOUT "hello world.\n";
4   print HANDOUT "hello world, again.\n";

(At the Command Line)
5   $ perl file.handle
6   $ cat newfile

(Output)
3   hello world.
4   hello world, again.

Explanation

  1. The scalar variable $file is set to the full pathname of a UNIX file called newfile. The scalar will be used to represent the name of the UNIX file to which output will be directed via the filehandle. This example will work the same way with Windows, but if you use the backslash as a directory separator, either enclose the path in single quotes, or use two backslashes; e.g., C:\\home\\ellie\\testing.

  2. The user-defined filehandle HANDOUT is attached to the file named in $file, newfile. Output printed to HANDOUT goes to that file rather than to STDOUT, the place where output normally goes. The > symbol indicates that newfile will be created if it does not exist and opened for writing. If it does exist, it will be opened and any text in it will be overwritten, so be careful!

  3. The print function will send its output to the filehandle, HANDOUT, instead of to the screen. The string hello world. will be written into newfile via the HANDOUT filehandle. The file newfile will remain open unless it is explicitly closed or the Perl script ends (see "Closing the Filehandle" on page 286).

  4. The print function will send its output to the filehandle HANDOUT instead of to the screen. The string hello world, again. will be written into newfile via the HANDOUT filehandle. The operating system keeps track of where the last write occurred and will send its next line of output to the location immediately following the last byte written to the file.

  5. The script is executed. The output is sent to newfile.

  6. The contents of the file newfile are printed.
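
The quoting advice for Windows paths in the first explanation can be checked with a short sketch. The path below is hypothetical and is never opened; the example only compares the strings. The forward-slash form is included on the assumption, true of Perl on Windows in general, that open accepts / as a directory separator.

```perl
use strict;
use warnings;

# Three ways to write the same (hypothetical) Windows path.
my $single  = 'C:\home\ellie\testing';       # Single quotes: \ is literal
my $doubled = "C:\\home\\ellie\\testing";    # Double quotes: \\ means \
my $forward = "C:/home/ellie/testing";       # Forward slashes as separators

print "Same string\n" if $single eq $doubled;
print "$single\n$forward\n";
```

A single backslash inside double quotes would be read as the start of an escape sequence, which is why \t in "C:\home\ellie\testing" would turn into a tab.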

10.1.4. Win32 Binary Files

Win32 distinguishes between text and binary files. If a binary file is read in text mode, a ^Z byte is treated as end of file, so reading may stop prematurely, and any \r\n sequences in the data are altered by the newline translation. When reading and writing Win32 binary files, use the binmode function to prevent these problems. The binmode function arranges for a specified filehandle to be read or written to in either binary (raw) or text mode. If the discipline argument is not specified, the mode is set to "raw." The discipline (called a layer in newer Perl documentation) is a name such as :raw, :crlf, or :utf8. (UNIX and Mac OS do not need binmode, because they delimit lines with a single character and encode that character as "\n"; calling binmode on those systems is harmless and keeps a script portable.)

Format

binmode FILEHANDLE
binmode FILEHANDLE, DISCIPLINE

Example 10.9.

# This script copies one binary file to another.
# Note its use of binmode to set the mode of the filehandle.

1   $infile="statsbar.gif";
2   open( INFILE, "<$infile" ) || die "Can't open $infile: $!\n";
3   open( OUTFILE, ">outfile.gif" ) || die "Can't open outfile.gif: $!\n";

4   binmode( INFILE );     # Crucial for binary files!

5   binmode( OUTFILE );
    # binmode should be called after open() but before any I/O
    # is done on the filehandle.
6   while ( read( INFILE, $buffer, 1024 ) ) {
7       print OUTFILE $buffer;
    }

8   close( INFILE );
    close( OUTFILE );

Explanation

  1. The scalar $infile is assigned a .gif filename.

  2. The file statsbar.gif is opened for reading and attached to the INFILE filehandle.

  3. The file outfile.gif is opened for writing and assigned to the OUTFILE filehandle.

  4. The binmode function arranges for the input file to be read in binary mode.

  5. The binmode function arranges for the output file to be written in binary mode.

  6. The read function reads up to 1,024 bytes at a time, storing the input read in the scalar $buffer. It returns the number of bytes actually read, which is 0 at end of file, so the loop ends when there is nothing left to read.

  7. Each time a chunk of bytes is read in, it is sent out to the output file.

  8. Both filehandles are closed. The result was that one binary file was copied to another binary file.
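
The copy loop above can be made a little more defensive by checking the return value of read, which is undefined on error and 0 at end of file. The following sketch assumes a hypothetical helper named copy_binary and builds its own scratch input file in place of statsbar.gif; it is one way to arrange the checks, not the only one.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical helper: copy $src to $dst in binary mode.
# Returns the number of bytes copied.
sub copy_binary {
    my ($src, $dst) = @_;
    open(my $in,  '<', $src) or die "Can't open $src: $!\n";
    open(my $out, '>', $dst) or die "Can't open $dst: $!\n";
    binmode($in);     # No newline translation on input
    binmode($out);    # No newline translation on output

    my ($buffer, $total) = ('', 0);
    while (1) {
        my $n = read($in, $buffer, 1024);
        die "Read error on $src: $!\n" unless defined $n;
        last if $n == 0;                       # End of file
        print {$out} $buffer or die "Write error on $dst: $!\n";
        $total += $n;
    }
    close($in);
    close($out) or die "Can't close $dst: $!\n";
    return $total;
}

# Build a scratch "binary" file containing \r\n and ^Z (0x1A) bytes.
my $dir = tempdir(CLEANUP => 1);
my $src = "$dir/in.bin";
my $dst = "$dir/out.bin";
open(my $fh, '>', $src) or die "Can't create $src: $!\n";
binmode($fh);
print {$fh} "GIF\r\n\x1A\x00data" x 500;       # 11 bytes, 500 times
close($fh) or die "Can't close $src: $!\n";

my $copied = copy_binary($src, $dst);
print "Copied $copied bytes.\n";
```

Checking close on the output handle matters because buffered data is written at that point, so a full disk may only show up there.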


10.1.5. Open for Appending

When a file is opened for appending, it is created if it does not exist. If it already exists, you must have write permission on it; its contents will be left intact, and the output will be appended to the end of the file. Again, the filehandle is used to access the file rather than accessing it by its real name.

Format

1   open(FILEHANDLE, ">> FILENAME");

Example 10.10.

1   open(APPEND, ">> temp");

Explanation

  1. The user-defined filehandle APPEND will be used to append output to the file called temp. The >> symbol works like the shell's append redirection: output printed to APPEND is added to the end of the temp file.


Example 10.11.

(The Text File)
$ cat newfile
hello world.
hello world, again.

(The Script)
    #!/usr/bin/perl
1   open(HANDLE, ">>newfile") ||
           die "Can't open newfile: $!\n";
2   print HANDLE "Just appended \"hello world\" ",
           "to the end of newfile.\n";

(Output)
$ cat newfile
hello world.
hello world, again.
Just appended "hello world" to the end of newfile.

Explanation

  1. The user-defined filehandle HANDLE will be used to append output to the file called newfile. As with the shell's >> redirection symbol, output printed to HANDLE is added to the end of newfile. If the file cannot be opened because, for example, the write permissions are turned off, the die function will print the error message Can't open newfile: Permission denied and the script will exit.

  2. The print function will send its output to the filehandle, HANDLE, instead of to the screen. The string, Just appended "hello world" to the end of newfile, will be written to end of newfile via the HANDLE filehandle.
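
The contrast between opening for writing and opening for appending can be seen side by side in a short sketch. It assumes a scratch directory created with the core File::Temp module, so no existing newfile is touched, and it uses lexical filehandles with the three-argument form of open.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/newfile";

# ">" creates the file (or empties an existing one) before writing.
open(my $out, '>', $file) or die "Can't open $file: $!\n";
print {$out} "hello world.\n";
close($out) or die "Can't close $file: $!\n";

# ">>" leaves the existing contents alone and writes at the end.
open(my $app, '>>', $file) or die "Can't open $file: $!\n";
print {$app} "hello world, again.\n";
close($app) or die "Can't close $file: $!\n";

# Read the file back to see both lines.
open(my $in, '<', $file) or die "Can't open $file: $!\n";
my @lines = <$in>;
close($in);
print @lines;
```

Had the second open used > instead of >>, the file would hold only the second line.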

10.1.6. The select Function

The select function sets the default output to the specified FILEHANDLE and returns the previously selected filehandle. Any print or write that does not name a filehandle will then go to the selected handle.

Example 10.12.

(The Script)
    #! /usr/bin/perl
1   open (FILEOUT,">newfile")  || die "Can't open newfile: $!\n";
2   select(FILEOUT);      # Select the new filehandle for output
3   open (DB, "<datebook") || die "Can't open datebook: $!\n";
    while(<DB>) {
4      print ;            # Output goes to FILEOUT, i.e., newfile
    }
5   select(STDOUT);       # Send output back to the screen
    print "Good-bye.\n";  # Output goes to the screen

Explanation

  1. newfile is opened for writing and assigned to the filehandle FILEOUT.

  2. The select function assigns FILEOUT as the current default filehandle for output. The return value from the select function is the name of the filehandle that was previously selected (STDOUT). STDOUT is not closed; it simply stops being the default until it is selected again.

  3. The DB filehandle is opened for reading.

  4. As each line is read into the $_ variable from DB, it is then printed to the currently selected filehandle, FILEOUT. Notice that you don't have to name the filehandle.

  5. By selecting STDOUT, the rest of the program's output will go to the screen.
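
Because select returns the previously selected filehandle, a script can save that value and restore it later without assuming that the old default was STDOUT. The following sketch shows the idea with a scratch file; the variable names are illustrative only.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($out, $filename) = tempfile(UNLINK => 1);

# select returns the handle that was selected before the call,
# so it can be saved and restored later.
my $previous = select($out);
print "This line goes to the file.\n";    # No filehandle named
select($previous);                        # Restore the old default

close($out) or die "Can't close $filename: $!\n";
print "This line goes to the screen.\n";

# Read the scratch file back to confirm what was written to it.
open(my $in, '<', $filename) or die "Can't open $filename: $!\n";
my @saved = <$in>;
close($in);
```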

10.1.7. File Locking with flock

To prevent two programs from writing to a file at the same time, you can lock the file so you have exclusive access to it and then unlock it when you're finished using it. The flock function takes two arguments: a filehandle and a file locking operation. The operations are listed in Table 10.1.[2]

[2] File locking may not be implemented on non-UNIX systems.

Table 10.1. File Locking Operations
Name       Operation   What It Does
LOCK_SH    1           Creates a shared lock
LOCK_EX    2           Creates an exclusive lock
LOCK_NB    4           Creates a nonblocking lock (combined with LOCK_SH or LOCK_EX)
LOCK_UN    8           Unlocks an existing lock


Read permission is required on a file to obtain a shared lock, and write permission is required to obtain an exclusive lock. With operations 1 and 2, normally the caller requesting the lock will block (wait) until the file is unlocked. If the nonblocking operation is added to the request (for example, 2 + 4 for a nonblocking exclusive lock), flock does not wait: it returns false immediately if the file is already locked.[3]

[3] flock may not work if the file is being accessed from a networked system.

Example 10.13.

    #!/bin/perl
    # Program that uses file locking -- UNIX
1   $LOCK_EX = 2;
2   $LOCK_UN = 8;

3   print "Adding an entry to the datafile.\n";
    print "Enter the name: ";
    chomp($name=<STDIN>);
    print "Enter the address: ";
    chomp($address=<STDIN>);

4   open(DB, ">>datafile") || die "Can't open: $!\n";

5   flock(DB, $LOCK_EX) || die ;        # Lock the file

6   print DB "$name:$address\n";

7   flock(DB, $LOCK_UN) || die;         # Unlock the file

Explanation

  1. The scalar is assigned the value of the operation that will be used by the flock function to lock the file. This operation is to block (wait) until an exclusive lock can be created.

  2. The scalar is assigned the value of the operation that tells flock to unlock the file so others can write to it.

  3. The user is asked for the information to update the file. This information will be appended to the file.

  4. The filehandle is opened for appending.

  5. The flock function puts an exclusive lock on the file.

  6. The data is appended to the file.

  7. Once the data has been appended, the file is unlocked so others can access it.
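
Instead of assigning the operation numbers by hand, a script can import the symbolic names from the standard Fcntl module. The sketch below requests a nonblocking exclusive lock on a scratch file; the record written to it is made-up sample data, and the example assumes a system on which flock is implemented.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);          # Imports LOCK_SH, LOCK_EX, LOCK_NB, LOCK_UN
use File::Temp qw(tempfile);

my ($tmp, $datafile) = tempfile(UNLINK => 1);
close($tmp);

open(my $db, '>>', $datafile) or die "Can't open $datafile: $!\n";

# LOCK_EX | LOCK_NB asks for an exclusive lock but returns false
# at once, instead of waiting, if another process holds the lock.
my $locked = flock($db, LOCK_EX | LOCK_NB);
if ($locked) {
    print {$db} "Betty Boop:123 Main Street\n";    # Sample record
    flock($db, LOCK_UN) or die "Can't unlock $datafile: $!\n";
}
else {
    warn "$datafile is busy: $!\n";
}
close($db) or die "Can't close $datafile: $!\n";
```

Closing the filehandle also releases the lock, so the explicit unlock is there mainly to make the intent visible.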

10.1.8. The seek and tell Functions

The seek Function

Seek allows you to randomly access a file. The seek function is the same as the fseek standard I/O function in C. Rather than closing the file and then reopening it, the seek function allows you to move to some byte (not line) position within the file. The seek function returns 1 if successful, 0 otherwise.

Format

seek(FILEHANDLE, BYTEOFFSET, FILEPOSITION);


The seek function sets a position in a file, where the first byte is 0. Positions are

0 = Beginning of the file

1 = Current position in the file

2 = End of the file

The offset is the number of bytes from the file position. A positive offset moves the position forward in the file; a negative offset moves the position backward in the file for position 1 or 2.

The od command lets you look at how the characters in a file are stored. This file was created on a Win32 platform; on UNIX systems, the linefeed/newline is one character, \n.

$ od -c db
0000000000   S   t   e   v   e       B   l   e   n   h   e   i   m  \r  \n
0000000020   B   e   t   t   y       B   o   o   p  \r  \n   L   o   r   i
0000000040       G   o   r   t   z  \r  \n   S   i   r       L   a   n   c
0000000060   e   l   o   t  \r  \n   N   o   r   m   a       C   o   r   d
0000000100  \r  \n   J   o   n       D   e   L   o   a   c   h  \r  \n   K
0000000120   a   r   e   n       E   v   i   c   h  \r  \n
0000000134

Example 10.14.

(The Text File: db)
Steve Blenheim
Betty Boop
Lori Gortz
Sir Lancelot
Norma Cord
Jon DeLoach
Karen Evich

----------------------------------------------------------------

(The Script)
    # Example using the seek function
1   open(FH,"db") or die "Can't open: $!\n";
2   while($line=<FH>){       # Loop through the whole file
3       if ($line =~ /Lori/) { chomp($line); print "--$line--\n"; }
    }
4   seek(FH,0,0);            # Start at the beginning of the file
5   while(<FH>) {
6      print if /Steve/;
    }

(Output)
3   --Lori Gortz--
6   Steve Blenheim

					  

Explanation

  1. The db file is assigned to the FH filehandle and opened for reading.

  2. Each line of the file is assigned, in turn, to the scalar $line while looping through the file.

  3. If $line contains Lori, the print statement is executed.

  4. The seek function positions the file pointer at the top of the file (byte 0, the first character), so the next read starts there. To get back to the top of the file without using seek, the filehandle would have to be closed with the close function and then opened again.

  5. Starting at the top of the file, the loop is entered. The first line is read from the filehandle and assigned to $_, the default line holder.

  6. If the pattern Steve is found in $_, the line will be printed.

Example 10.15.

(The Text File: db)
Steve Blenheim
Betty Boop
Lori Gortz
Sir Lancelot
Norma Cord
Jon DeLoach
Karen Evich

----------------------------------------------------------------

(The Script)
1   open(FH, "db") or die "Can't open db: $!\n";
2   while(<FH>){
3       last if /Norma/;   # This is the last line that
                           # will be processed
    }
4   seek(FH,0,1) or die;   # Seeking from the current position
5   $line=<FH>;            # This is where the read starts again
6   print "$line";
7   close FH;

(Output)
6   Jon DeLoach

Explanation

  1. The db file is opened for reading via the FH filehandle.

  2. The while loop is entered. A line from the file is read and assigned to $_.

  3. When the line containing the pattern Norma is reached, the last statement causes the loop to be exited.

  4. The seek function is called with a byte offset of 0 from the current position in the file (position 1), so the file pointer stays where the next read operation would have started: at the beginning of the line right after the line that contained Norma. The byte offset could be either a negative or positive value.

  5. A line is read from the db file and assigned to the scalar $line. The line read is the one that follows the line on which the last statement caused the loop to exit.

  6. The value of $line is printed.

Example 10.16.

(The Script)
1   open(FH, "db") or die "Can't open db: $!\n";
2   seek(FH,-13,2) or die;
3   while(<FH>){
4       print;
    }

(Output)
4   Karen Evich

Explanation

  1. The db file is opened for reading via the FH filehandle.

  2. The seek function starts at the end of the file (position 2) and backs up 13 bytes. The newline (\r\n), although not visible, is represented as the last two bytes in the line (Windows).

  3. The while loop is entered, and each line, in turn, is read from the filehandle FH.

  4. Each line is printed. By backing up 13 characters from the end of the file, Karen Evich is printed. Note the output of the od -c command and count back 13 characters from the end of the file.

    0000000000     S   t   e   v   e       B   l   e   n   h   e   i    m  \r  \n
    0000000020     B   e   t   t   y       B   o   o   p  \r  \n   L   o   r    i
    0000000040         G   o   r   t   z  \r  \n   S   i   r       L   a   n    c
    0000000060     e   l   o   t  \r  \n   N   o   r   m   a       C   o   r    d
    0000000100    \r  \n   J   o   n       D   e   L   o   a   c   h  \r  \n    K
    0000000120     a   r   e   n       E   v   i   c   h  \r  \n
    0000000134
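
The three FILEPOSITION values also have symbolic names, SEEK_SET (0), SEEK_CUR (1), and SEEK_END (2), which the standard Fcntl module exports. The following sketch uses them on a scratch file with UNIX-style line endings, so the last line is 12 bytes long and not 13 as in the Windows file above. The file contents and variable names are chosen only for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:seek);           # Imports SEEK_SET, SEEK_CUR, SEEK_END
use File::Temp qw(tempfile);

# Scratch file with UNIX-style (\n) line endings.
my ($tmp, $filename) = tempfile(UNLINK => 1);
binmode($tmp);
print {$tmp} "Jon DeLoach\nKaren Evich\n";
close($tmp) or die "Can't close $filename: $!\n";

open(my $fh, '<', $filename) or die "Can't open $filename: $!\n";
binmode($fh);

# "Karen Evich\n" is 12 bytes, so back up 12 bytes from the end.
seek($fh, -12, SEEK_END) or die "Can't seek: $!\n";
my $last = <$fh>;

# Offset 0 from SEEK_SET returns to the first byte of the file.
seek($fh, 0, SEEK_SET) or die "Can't seek: $!\n";
my $first = <$fh>;
close($fh);

print $last, $first;
```

Byte offsets depend on how lines end, which is why the same seek needs -13 for the Windows file and -12 here.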
    
    					  

The tell Function

The tell function returns the current byte position in the file and is used with the seek function to move to that position in the file. If FILEHANDLE is omitted, tell returns the position of the file last read.

Format

tell(FILEHANDLE);
tell;

Example 10.17.

(The Text File: db)
Steve Blenheim
Betty Boop
Lori Gortz
Sir Lancelot
Norma Cord
Jon DeLoach
Karen Evich

----------------------------------------------------------------

(The Script)
    #!/usr/bin/perl
    # Example using the tell function
1   open(FH,"db") || die "Can't open: $!\n";
2   while ($line=<FH>) {      # Loop through the whole file
       chomp($line);
3      if ($line =~ /^Lori/) {
4          $currentpos=tell;
5          print "The current byte position is $currentpos.\n";
6          print "$line\n\n";
       }
    }
7   seek(FH,$currentpos,0);   # Go back to the saved byte position
8   @lines=(<FH>);
9   print @lines;

(Output)
5   The current byte position is 40.
6   Lori Gortz

9   Sir Lancelot
    Norma Cord
    Jon DeLoach
    Karen Evich

					  

Explanation

  1. The db file is assigned to the FH filehandle and opened for reading.

  2. Each line of the file is assigned, in turn, to the scalar $line while looping through the file.

  3. If the scalar $line contains the regular expression Lori, the if block is entered.

  4. The tell function is called and returns the current byte position (starting at byte 0) in the file. Because the line containing Lori has just been read, this is the position of the first character of the following line, where the next read will start.

  5. The value in bytes is stored in $currentpos. It is printed. Byte position 40 represents the position where Sir Lancelot starts the line.

  6. The line containing the regular expression Lori is printed.

  7. The seek function will position the file pointer for FH at the byte offset, $currentpos, 40 bytes from the beginning of the file. Without seek, the filehandle would have to be closed and reopened, and the lines would have to be read again from the top of the file to reach this point.

  8. The lines starting at offset 40 are read in and stored in the array @lines.

  9. The array is printed, starting at offset 40.
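
Used together, tell and seek act like a bookmark: tell records a position and seek returns to it. A minimal sketch of that pairing follows, with a scratch file standing in for db and a variable named $bookmark that is not part of the original example.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

my ($tmp, $filename) = tempfile(UNLINK => 1);
print {$tmp} "Betty Boop\nLori Gortz\nSir Lancelot\n";
close($tmp) or die "Can't close $filename: $!\n";

open(my $fh, '<', $filename) or die "Can't open $filename: $!\n";

my $bookmark;
while (my $line = <$fh>) {
    # After a line is read, tell reports where the NEXT read starts.
    $bookmark = tell($fh) if $line =~ /^Lori/;
}

# Return to the saved byte position and read one line from there.
seek($fh, $bookmark, 0) or die "Can't seek: $!\n";
my $next = <$fh>;
close($fh);

print "Bookmark at byte $bookmark: $next";
```

Naming the filehandle in the call to tell avoids depending on which file happened to be read last.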


10.1.9. Open for Reading and Writing

Table 10.2. Reading and Writing Operations
Symbol   Open For
+<       Read first, then write
+>       Write first, then read
+>>      Append first, then read


Example 10.18.

(The Script)
    # Scriptname: countem.pl
    # Open visitor_count for reading first, and then writing
1   open(FH, "+<visitor_count") ||
           die "Can't open visitor_count: $!\n";
2   $count=<FH>;           # Read a number from the file
3   print "You are visitor number $count.";
4   $count++;
5   seek(FH, 0,0) || die;  # Seek back to the top of the file
6   print FH $count;       # Write the new number to the file
7   close(FH);

(Output)
(First run of countem.pl)
You are visitor number 1.

(Second run of countem.pl)
You are visitor number 2.


Explanation

  1. The file visitor_count is opened for reading first, and then writing. If the file does not exist or is not readable, die will cause the program to exit with an error message.

  2. A line is read from the visitor_count file. The first time the script is executed, the number 1 is read in from visitor_count file and stored in the scalar $count.

  3. The value of $count is printed.

  4. The $count scalar is incremented by 1.

  5. The seek function moves the file pointer to the beginning of the file.

  6. The new value of $count is written back to the visitor_count file. The number that was there is overwritten by the new value of $count each time the script is executed.

  7. The file is closed.
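
The read, seek, and write sequence can be wrapped in a small subroutine. This sketch assumes a hypothetical helper named bump_counter and a scratch file that starts at 9, so that the step from a one-digit to a two-digit number is visible; it does not lock the file, which a counter shared by several processes would need.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: add one to the number stored in $file
# and return the new value.
sub bump_counter {
    my ($file) = @_;
    open(my $fh, '+<', $file) or die "Can't open $file: $!\n";
    my $count = <$fh>;
    $count = 0 unless defined $count;            # Empty file
    chomp($count);
    $count++;
    seek($fh, 0, 0) or die "Can't seek: $!\n";   # Back to the top
    print {$fh} "$count\n";                      # Overwrite the old number
    close($fh) or die "Can't close $file: $!\n";
    return $count;
}

# Scratch counter file that starts at 9.
my ($tmp, $counter_file) = tempfile(UNLINK => 1);
print {$tmp} "9\n";
close($tmp) or die "Can't close $counter_file: $!\n";

my $first  = bump_counter($counter_file);
my $second = bump_counter($counter_file);
print "Counter is now $second.\n";
```

Overwriting in place works here because the number never gets shorter; a value that could shrink would leave stale bytes at the end of the file unless the file were truncated.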

Example 10.19.

(The Script)
    #!/usr/bin/perl
    # Open for writing first, then reading
    print "\n\n";
1   open(FH, "+>joker") || die;
2   print FH "This line is written to joker.\n";
3   seek(FH,0,0);         # Go to the beginning of the file
4   while(<FH>) {
5       print;            # Reads from joker; the line is in $_
    }

(Output)
5   This line is written to joker.

Explanation

  1. The filehandle FH is opened for writing first. This means that the file joker will be created or, if it already exists, it will be truncated. Be careful not to mix up +< and +>.

  2. The output is sent to joker via the FH filehandle.

  3. The seek function moves the file pointer to the beginning of the file.

  4. The while loop is entered. A line is read from the file joker via the FH filehandle and stored in $_.

  5. Each line ($_) is printed after it is read until the end of the file is reached.

10.1.10. Open for Pipes

When using a pipe (also called a filter), a connection is made from one program to another. The program on the left-hand side of a pipe sends its output into a temporary buffer. This program writes into the pipe. On the other side of the pipe is a program that is a reader. It gets its input from the buffer. Here is an example of a typical UNIX pipe:

who | wc -l

and an MS-DOS pipe:

dir /B | more

The output of the who command is sent to the wc command. The who command sends its output to the pipe; it writes to the pipe. The wc command gets its input from the pipe; it reads from the pipe. (If the wc command were not a reader, it would ignore what is in the pipe.) The output is sent to the STDOUT, the terminal screen. The number of people logged on is printed. If Perl is on the left-hand side of a pipe, then Perl is the writer and sends output to the buffer; if Perl is on the right-hand side of the pipe, then Perl reads from the buffer. It is important to keep in mind that the process connecting to Perl is an operating system command. If you are running Perl on a UNIX or Linux system, the commands will be different from those on a Windows system, thereby making Perl scripts implementing pipes unportable between systems.

Figure 10.1. UNIX pipe example.


The Output Filter

When creating a filehandle with the open function, you can open a filter so that the output is piped to a system command. The command is preceded by a pipe symbol (|) and replaces the filename argument in the previous examples. Whatever is printed to the filehandle will be piped to the command, and the command's own output will be sent to STDOUT.

Format

1   open(FILEHANDLE, "| COMMAND");

Example 10.20.

(The Script)
    #!/bin/perl
    # Scriptname: outfilter (UNIX)
1   open(MYPIPE, "| wc -w");
2   print MYPIPE "apples pears peaches";
3   close(MYPIPE);

(Output)
3

Explanation

  1. The user-defined filehandle MYPIPE will be used to pipe output from the Perl script to the UNIX command wc -w, which counts the number of words in the string.

  2. The print function sends the string apples pears peaches to the output filter filehandle MYPIPE; the string is piped to the wc command. Since there are three words in the string, the output 3 will be sent to the screen.

  3. After you have finished using the filehandle, use the close function to close it. This guarantees that the command will complete before the script exits. If you don't close the filehandle, the output may not be flushed properly.
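The following is a minimal sketch of the same output filter with error checking added. It assumes a UNIX-like system with wc on the PATH; the helper name pipe_to_command is hypothetical and does not come from the text. It uses the three-argument form of open, where the mode '|-' means "write to this command," and it checks the return value of close, which is false when the command exits with a nonzero status.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: send $text through an output filter and
# report whether the command succeeded.
sub pipe_to_command {
    my ($command, $text) = @_;

    # Mode '|-' opens the command for writing (an output filter).
    open(my $pipe, '|-', $command) or die "Can't start '$command': $!\n";
    print {$pipe} $text;

    # close waits for the command to finish; its exit status is in $?.
    close($pipe) or warn "'$command' exited with status ", $? >> 8, "\n";
    return $? == 0;
}

# Assumes wc is available; it prints the word count to the screen.
pipe_to_command('wc -w', "apples pears peaches\n");
```

Checking close matters for output filters because a failure in the command is often not visible until the pipe is closed.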


Figure 10.2. Perl output filter.


Example 10.21.

(The Script)
1   open(FOO, "| tr '[a-z]' '[A-Z]'");
2   print FOO "hello there\n";
3   close FOO;   # If you don't close FOO, the output may be delayed

(Output)
2   HELLO THERE

Explanation

  1. The user-defined filehandle FOO will be used to send output from your Perl script to the UNIX command tr, which will translate lowercase letters to uppercase.

  2. The print function sends the string hello there to the output filter filehandle FOO; that is, the string is piped to the tr command. The string, after being filtered, will be sent to the screen with all the characters translated to uppercase.

Example 10.22.

(The Text File)
$ cat emp.names
1 Steve Blenheim
2 Betty Boop
3 Igor Chevsky
4 Norma Cord
5 Jon DeLoach
6 Karen Evich

(The Script)
    #!/usr/bin/perl
1   open(FOO, "| sort  +1| tr '[a-z]' '[A-Z]'"); # Open output filter
2   open(DB, "emp.names");       # Open DB for reading
3   while(<DB>){ print FOO ; }
4   close FOO;

(Output)
2   BETTY BOOP
3   IGOR CHEVSKY
5   JON DELOACH
6   KAREN EVICH
4   NORMA CORD
1   STEVE BLENHEIM

Explanation

  1. The user-defined filehandle FOO will be used to pipe output to the UNIX command sort, and the output of sort will be piped to the tr command. The sort +1 command skips the first field and sorts on the second, where fields are words separated by whitespace. (The +1 form is obsolete; many current versions of sort no longer accept it and expect -k 2 instead.) The UNIX tr command translates lowercase letters into uppercase letters.

  2. The open function creates the filehandle DB and attaches it to the UNIX file emp.names.

  3. The expression in the while loop contains the filehandle DB, enclosed in angle brackets, indicating a read operation. Each iteration of the loop reads the next line from the emp.names file and stores it in the $_ scalar variable. The input line is sent to the output filter, FOO. The loop iterates until end of file is reached. Because sort cannot produce output until it has read all of its input, nothing is printed to the screen until FOO is closed. Note that when the file is sorted by the second field, the numbers in the first column are no longer sorted.

  4. The close function closes the filehandle FOO.
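Because the filter in Example 10.22 depends on the UNIX sort and tr commands, it will not run unchanged on a Windows system. As a rough sketch of one portable alternative (not the pipe technique shown above), the same result can be produced inside Perl with the built-in sort and uc functions. The helper name sort_by_second_field is hypothetical, and the sample lines are taken from emp.names.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: order lines by everything after the first
# whitespace-separated field, then translate them to uppercase.
sub sort_by_second_field {
    my @lines = @_;
    my @sorted =
        map  { $_->[1] }                      # recover the original line
        sort { $a->[0] cmp $b->[0] }          # compare the sort keys
        map  { my ($first, $rest) = split ' ', $_, 2;
               [ defined $rest ? $rest : '', $_ ] }
        @lines;
    return map { uc($_) } @sorted;
}

my @names = ("1 Steve Blenheim\n", "2 Betty Boop\n", "3 Igor Chevsky\n",
             "4 Norma Cord\n", "5 Jon DeLoach\n", "6 Karen Evich\n");
print sort_by_second_field(@names);
```

The trade-off is that the whole file is held in memory, which the external sort command also has to manage on its own, but no operating system command is needed.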

Sending the Output of a Filter to a File

In the previous example, what if you had wanted to send the output of the filter to a file instead of to STDOUT? You can't send output to a filter and a filehandle at the same time, but you can redirect STDOUT to a file before opening the filter. Since, later in the program, you may want STDOUT directed back to the screen, you can either save it first or simply reopen STDOUT to the terminal device by typing

open(STDOUT, ">/dev/tty");

Example 10.23.

    #!/usr/bin/perl
    # Program to redirect STDOUT from filter to a UNIX file
1   $| = 1;           # Flush buffers
2   $tmpfile = "temp";
3   open(DB, "data") || die qq/Can't open "data": $!\n/;
                                            # Open DB for reading
4   open(SAVED, ">&STDOUT") || die "$!\n";  # Save stdout
5   open(STDOUT, ">$tmpfile" ) || die "Can't open: $!\n";
6   open(SORT, "| sort +1") || die;         # Open output filter
7   while(<DB>){
8      print SORT;   # Output is first sorted and then sent to temp.
9   }
10  close SORT;
11  open(STDOUT, ">&SAVED") || die "Can't open";
12  print "Here we are printing to the screen again.\n";
                      # This output will go to the screen
13  rename("temp","data");

Explanation

  1. Setting the $| variable to a nonzero value forces a flush after every write on the currently selected output filehandle, which is STDOUT by default. (See autoflush module in Appendix A.)

  2. The scalar $tmpfile is assigned temp to be used later as an output file.

  3. The UNIX data file is opened for reading and attached to the DB filehandle.

  4. STDOUT is being copied and saved in another filehandle called SAVED. Behind the scenes, the file descriptors are being manipulated.

  5. The temp file is being opened for writing and is assigned to the file descriptor normally reserved for STDOUT, the screen. The file descriptor for STDOUT has been closed and reopened for temp.

  6. The output filter will be assigned to SORT. Perl's output will be sent to the UNIX sort utility.

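  7. The while loop reads one line at a time from the DB filehandle and stores it in $_.

  8. Each line is printed to the SORT filter. Because the sort command inherits the redirected STDOUT, its sorted output is written to the temp file.

  9. The closing curly brace ends the while loop.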

  10. Close the output filter.

  11. Open the standard output filehandle so that output is redirected back to the screen.

  12. This line prints to the screen because STDOUT has been reassigned there.

  13. The temp file is renamed data, overwriting what was in data with the contents of temp.
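The save, redirect, and restore steps can also be sketched with lexical filehandles and the three-argument form of open. This is an illustration under stated assumptions, not a replacement for the example above: the helper name filter_to_file and the file name sorted.tmp are hypothetical, and the sort -k 2 spelling assumes a current UNIX-like sort command.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: feed @lines to a shell command while STDOUT is
# temporarily redirected to $outfile, then restore STDOUT.
sub filter_to_file {
    my ($command, $outfile, @lines) = @_;

    open(my $saved, '>&', \*STDOUT)  or die "Can't save STDOUT: $!\n";
    open(STDOUT, '>', $outfile)      or die "Can't open $outfile: $!\n";

    # The command started here inherits the redirected STDOUT.
    open(my $filter, '|-', $command) or die "Can't start '$command': $!\n";
    print {$filter} @lines;
    close($filter)                   or warn "'$command' failed: status $?\n";

    # Point STDOUT back at wherever it went before the redirection.
    open(STDOUT, '>&', $saved)       or die "Can't restore STDOUT: $!\n";
    close($saved);
    return;
}

# sorted.tmp is a made-up file name; it is created in the current directory.
filter_to_file('sort -k 2', 'sorted.tmp',
               "1 Steve Blenheim\n", "2 Betty Boop\n");
print "Here we are printing to the screen again.\n";
```

Closing the filter before restoring STDOUT is the important ordering: close waits for sort to finish writing to the file.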

Input Filter

When creating a filehandle with the open function, you can also open a filter so that input is piped into Perl. The command ends with a pipe symbol.

Format

open(FILEHANDLE, "COMMAND |");

Example 10.24.

    #!/bin/perl
    # Scriptname: infilter
1   open(INPIPE, "date |");    # Windows (2000/NT) use:  date /T
2   $today = <INPIPE>;
3   print $today;
4   close(INPIPE);

(Output)
Sun Feb 18 14:12:44 PST 2007

Explanation

  1. The user-defined filehandle INPIPE will be used to pipe the output from the filter as input to Perl. The output of a UNIX date command will be used as input by your Perl script via the INPIPE filehandle. Windows 2000/NT users: use date /T.

  2. The scalar $today will receive its input from the INPIPE filehandle; in other words, Perl reads from INPIPE.

  3. The value of the UNIX date command was assigned to $today and is displayed.

  4. After you have finished using the filehandle, use the close function to close it. Closing a pipe filehandle makes Perl wait for the command to finish, and the command's exit status is then available in the $? variable.
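As a hedged sketch, an input filter can also be opened with the three-argument form of open, where the mode '-|' means "read from this command." The helper name read_from_command is hypothetical. Passing the command as a list, as done here, runs it without a shell; this sketch assumes a UNIX-like system with date on the PATH.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: run a command and return its output lines.
sub read_from_command {
    my @command = @_;

    # Mode '-|' opens the command for reading (an input filter).
    open(my $pipe, '-|', @command) or die "Can't run '@command': $!\n";
    my @lines = <$pipe>;

    # close waits for the command; a nonzero exit makes it return false.
    close($pipe) or warn "'@command' exited with status ", $? >> 8, "\n";

    chomp @lines;    # remove the trailing newline from each line
    return @lines;
}

my ($today) = read_from_command('date');
print "$today\n";
```

Because no shell is involved in the list form, arguments that contain spaces or quotes are passed to the command exactly as written.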


Figure 10.3. Perl input filter.


Example 10.25.

(The Script)
1   open(FINDIT, "find . -name 'perl*' -print |") ||
           die "Couldn't execute find!\n";
2   while( $filename = <FINDIT> ){
3      print $filename;
    }

(Output)
3   ./perl2
    ./perl3
    ./perl.man
    ./perl4
    ./perl5
    ./perl6
    ./perl7
    ./perlsub
    ./perl.arg

Explanation

  1. The output of the UNIX find command will be piped to the input filehandle FINDIT. When FINDIT is enclosed in angle brackets, input is read from the find command instead of from STDIN. If the open fails, the die operator will print Couldn't execute find! and exit the script.

  2. The output from the UNIX find command has been piped into the filehandle FINDIT. For each iteration of the while loop, one line from the FINDIT filehandle will be assigned to the scalar variable $filename.

  3. The print function prints the value of the variable $filename to the screen.

Example 10.26.

(The Script)
    # Opening an input filter on a Win32 platform
1   open(LISTDIR, 'dir "C:\perl" |') || die;
2   @filelist = <LISTDIR>;
3   foreach $file ( @filelist ){
        print $file;
    }

(Output)
 Volume in drive C is 010599
 Volume Serial Number is 2237-130A

 Directory of C:\perl

03/31/1999  10:34p      <DIR>           .
03/31/1999  10:34p      <DIR>           ..
03/31/1999  10:37p               30,366 DeIsL1.isu
03/31/1999  10:34p      <DIR>           bin
03/31/1999  10:34p      <DIR>           lib
03/31/1999  10:35p      <DIR>           html
03/31/1999  10:35p      <DIR>           eg
03/31/1999  10:35p      <DIR>           site
               1 File(s)          30,366 bytes
               7 Dir(s)      488,873,984 bytes free

Explanation

  1. The output of the Windows dir command will be piped to the input filehandle LISTDIR. When LISTDIR is enclosed in angle brackets, input is read from the dir command instead of from STDIN. If the open fails, the die operator will print an error and exit the script.

  2. The output from the Windows dir command has been piped into the filehandle LISTDIR. The input is read from the filehandle and assigned to the array @filelist. Each element of the array represents one line of input.

  3. The foreach loop iterates through the array, printing one line at a time until the end of the array.
