Beginning Perl Lesson 5

Stopping a script with the die() function

We often want a script to stop running when it encounters a situation that prevents it from proceeding productively. For example, as we’ll see in the next section of this lesson, the script might need to read data lines from a file. But if the script can’t open the file, there will be nothing to read. In this sort of situation, we’d like a way to tell Perl to quit immediately.

For this situation, Perl provides the die() function. This function takes an optional argument, a string that contains an error message. Here’s an example:

#!/usr/bin/perl
#
#   die.pl
#   06-Jun-2004
#
#   Conrad Halling
#   conrad.halling@sphaerula.com
#
#   This script shows an example of using the die() function.

    die( "Fatal error" );

When Perl encounters die(), it prints the error message, followed by the name of the script and the number of the line where die() was called, and then quits immediately. Perl returns a non-zero status code to the operating system to indicate an error. The output looks like this:

Fatal error at die.pl line 11.
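
One detail is worth knowing here: when the error message ends with a newline (\n), die() prints the message exactly as given and does not append the script name and line number. The sketch below shows both forms. It uses eval to trap each die() so that one script can display both messages; eval and the $@ variable are not covered in this lesson and appear here only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my( $withLocation );
my( $withoutLocation );

#   eval traps the die() so the script keeps running. The message
#   that die() would have printed is left in the special variable $@.

eval { die( "Fatal error" ); };
$withLocation = $@;

eval { die( "Fatal error\n" ); };
$withoutLocation = $@;

#   The first message has the script name and line number appended.
#   The second message is exactly the text that was passed to die().

print( "Without newline: $withLocation" );
print( "With newline:    $withoutLocation" );
```

End the message with a newline when the script name and line number would only distract the user; leave the newline off while you are debugging.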

Reading files

One of the most common things you’ll do with Perl is read a file to get its data, work with the data, then write the results to another file. Since you’ll do this so frequently, you’ll want to master this right away.

Opening a file

We open the input file using Perl’s open() function. open() takes two arguments, the name of a filehandle and the name of the file we want to open. Opening a file for reading looks like this:

open( IN, "<$fileName" );

The first argument, the filehandle, is an identifier that Perl uses to refer to a file once it has been opened. You get to make up the name of the filehandle. It is customary in Perl to make filehandle names all upper-case.

The second argument is the name of the file we want to open. When we want to open the file for reading, we put a less than symbol (<) in front of the file name.

We need to know if Perl was able to open the file, since Perl can’t read lines from a file it can’t open. (For example, if the file name is incorrect, Perl won’t be able to open the file because it can’t find it. Or perhaps Perl can find the file, but the user running the script might not have permission to read the file.)

The open() function returns a true value when Perl opens the file successfully, but open() returns a false value when Perl can’t open the file. We can capture the result in a variable and test the variable to see if Perl opened the file successfully, like this:

$result = open( IN, "<$fileName" );
if ( $result )
{
    #   If the file is open, Perl executes the code in this block.
}
else
{
    #   If the file couldn't be opened, Perl executes the code in
    #   this block.
}

Perl has a special variable, $!, in which Perl stores an error message describing why the most recent system request, such as opening a file, failed.

If Perl can’t open the file, the best thing to do is to call the die() function to stop the script. We include the $! variable in the error string we give to die() to tell the user why the file couldn’t be opened. We would call die() like this:

$result = open( IN, "<$fileName" );
if ( $result )
{
    #   We're OK. Read lines from the file.
}
else
{
    #   The open failed, so die here.

    die( "Can't open file $fileName for reading: $!" );
}

When the file can’t be opened, the error message will look something like this:

Can't open file data.txt for reading: No such file or directory at
show.pl line 35.
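
To see what $! contains without stopping the script, we can print the message instead of calling die(). This is a minimal sketch; the file name is an assumption, chosen because no file with that name should exist in the current directory, and the exact wording of the message depends on the operating system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my( $fileName );
my( $message );
my( $result );

#   Assumption: no file with this name exists in the current directory.

$fileName = "no-such-file-for-lesson-5.txt";

$result = open( IN, "<$fileName" );
if ( $result )
{
    $message = "Opened $fileName";
    close( IN );
}
else
{
    #   $! holds the reason the open failed. Copy it right away,
    #   before another operation has a chance to change it.

    $message = "Can't open file $fileName for reading: $!";
}

print( "$message\n" );
```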

Reading lines

Once we have opened the file, we get data from the file using the angle operator (<>). But now, instead of using an empty angle operator, as we did when we wanted to get input from the user, we put the name of the filehandle in it, like this, <IN>, and we assign the result to a variable. Reading a line from a file looks like this:

$dataLine = <IN>;

Every time we use the angle operator, we get a single line from the file. To read all of the lines from the file, we need to use a while loop. But eventually, as we keep looping, we’ll exhaust the contents of the file. When this happens, the angle operator returns an undefined value. So we can determine when we’re out of data each time through the loop by seeing whether $dataLine contains a defined value. We can do this using Perl’s defined() function. We build our loop like this:

while ( defined( $dataLine = <IN> ) )
{
    #   Process the line.
}

How does this work? In Perl, things happen first in the innermost set of parentheses. So what happens first is that a line of data is obtained from the file that corresponds to the IN filehandle, and that line of data is placed into the $dataLine variable. If there is no data because we’ve reached the end of the file, then the $dataLine variable is set to an undefined value.

Next, we test the $dataLine variable using the defined() function. If $dataLine has a defined value, the expression is true, and Perl executes the code inside the while loop. If $dataLine doesn’t have a defined value, then the expression is false, and Perl skips the code inside the while loop.
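
The sketch below shows why the loop tests defined() rather than testing the line itself. It assumes a small sample file, created with the standard File::Temp module so that the script can run on its own, whose last line is a single 0 with no newline after it. Perl treats the string "0" as a false value, but it is still a defined value, so the loop processes it. The chomp() function removes the trailing newline from each line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw( tempfile );

my( $dataLine );
my( $fileName );
my( $lineCount );
my( $success );
my( $tempHandle );

#   Create a three-line sample file. tempfile() returns an open
#   filehandle and the name of the new file.

( $tempHandle, $fileName ) = tempfile( UNLINK => 1 );
print( $tempHandle "alpha\nbeta\n0" );
close( $tempHandle );

$lineCount = 0;
$success = open( IN, "<$fileName" );
if ( $success )
{
    #   The last line, "0", is false but defined, so it is not skipped.

    while ( defined( $dataLine = <IN> ) )
    {
        $lineCount = $lineCount + 1;
        chomp( $dataLine );
        print( "$lineCount: $dataLine\n" );
    }
    close( IN );
}
else
{
    die( "Can't open file $fileName for reading: $!.\n" );
}
```

The script prints three numbered lines, ending with "3: 0".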

Closing the open file

Finally, when we’re done reading from the file, we use Perl’s close() function to close the file. The code looks like this:

close( IN );

Example script

Here’s a complete script that puts all the pieces together. This script opens a file, reads it line by line, printing each line to the screen, then closes the file.

#!/usr/bin/perl
#
#   showFile.pl
#   06-Jun-2004
#
#   Conrad Halling
#   conrad.halling@sphaerula.com
#
#   This script shows the contents of a file. The script opens a file,
#   reads it line by line, prints each line to the screen, and closes the file.
#   The file the script opens is the script itself.

use warnings;

    my( $dataLine );
    my( $fileName );
    my( $success );

    #   Hard-code the name of the file we're going to read.
    #   This is the script itself.

    $fileName = "showFile.pl";

    #   Attempt to open the file. We create a file handle called "IN".
    #   Open returns a true value if it succeeded, an undefined
    #   value if it didn't succeed.

    $success = open( IN, "<$fileName" );
    if ( $success )
    {
        #   We opened the file successfully.
        #   We loop until we don't get any more data from the file. When the
        #   file has been exhausted, the value of $dataLine is set to undefined.
        #   So we simply test $dataLine to see if it is defined after we have
        #   obtained another line from the file.

        while ( defined( $dataLine = <IN> ) )
        {
            print( $dataLine );
        }

        #   We have read all the lines from the file.
        #   Close the file.

        close( IN );
    }
    else
    {
        #   Opening failed. The error message is stored in a special Perl
        #   variable, $!. The die() function prints an error message, then
        #   causes Perl to quit.

        die( "Can't open file '$fileName' for reading: $!.\n" );
    }

To show that the script dies when the file name is incorrect, modify the script to set $fileName to "showFile.p" instead of "showFile.pl":

$fileName = "showFile.p";

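Here’s what the output looks like after this change. Because the error message we gave to die() ends with a newline (\n), Perl doesn’t append the script name and line number.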

Can't open file 'showFile.p' for reading: No such file or directory.

Here’s some advice about files. When you’re writing a script that will read a file, start with the functionality above. That is, start with a script that will just get a file name, open the file, print the file to the screen, and close the file. Once you know you’re reading the file correctly, then you can begin writing the code that will manipulate the file’s data.

For now, I have hard-coded the file name into the script. We could make this script more useful by modifying it to ask the user for the file name. In a later lesson, we’ll learn how the user could pass the file name on the command line.

The ! (not) operator

The ! operator reverses a boolean value. That is, it converts a true value to a false value, and it converts a false value to a true value.
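
A short sketch makes the reversal visible. The variable names are made up for this illustration. Note that Perl’s false value prints as an empty string, while its true value prints as 1.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my( $notFalse );
my( $notTrue );

$notTrue  = ! 1;    #   1 is true, so ! 1 is false.
$notFalse = ! 0;    #   0 is false, so ! 0 is true.

print( "! 1 gives \"$notTrue\"\n" );
print( "! 0 gives \"$notFalse\"\n" );

if ( ! $notTrue )
{
    print( "A false value reversed is true.\n" );
}
```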

We can make good use of the ! operator in our scripts when we check whether we’ve successfully opened a file. In the showFile.pl script above, we captured the value that the open() function returned in a variable named $success. Then we tested $success to determine whether we could continue or we had to call die() because we weren’t able to open the file.

Here’s the code:

$success = open( IN, "<$fileName" );
if ( $success )
{
    #   We opened the file successfully.
    #   We loop until we don't get any more data from the file. When the
    #   file has been exhausted, the value of $dataLine is set to undefined.
    #   So we simply test $dataLine to see if it is defined after we have
    #   obtained another line from the file.

    while ( defined( $dataLine = <IN> ) )
    {
        print( $dataLine );
    }

    #   We have read all the lines from the file.
    #   Close the file.

    close( IN );
}
else
{
    #   Opening failed. The error message is stored in a special Perl
    #   variable, $!. The die() function prints an error message, then
    #   causes Perl to quit.

    die( "Can't open file '$fileName' for reading: $!.\n" );
}

Usually when we open a file, we want to call die() right away if we weren’t able to open the file. This is where we can make use of the ! operator. Instead of making the first test "if success":

if ( $success )

we can test for "if not success":

if ( ! $success )

This lets us reverse the order of the two branches of the if/else statement in our code, like this:

$success = open( IN, "<$fileName" );
if ( ! $success )
{
    #   Opening failed. The error message is stored in a special Perl
    #   variable, $!. The die() function prints an error message, then
    #   causes Perl to quit.

    die( "Can't open file '$fileName' for reading: $!.\n" );
}
else
{
    #   We opened the file successfully.
    #   We loop until we don't get any more data from the file. When the
    #   file has been exhausted, the value of $dataLine is set to undefined.
    #   So we simply test $dataLine to see if it is defined after we have
    #   obtained another line from the file.

    while ( defined( $dataLine = <IN> ) )
    {
        print( $dataLine );
    }

    #   We have read all the lines from the file.
    #   Close the file.

    close( IN );
}

This way of writing the code often feels more logical. If an error occurs, we can deal with it immediately by calling die() rather than putting this off until later.

From now on, when we open a file, we’ll test for "not success" and call die() right away if we weren’t able to open the file.
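
In newer Perl code you will often see a variation on this test, sketched below as an alternative rather than as the style this lesson uses. It assumes Perl 5.6 or later. The open() call takes three arguments, so the mode ("<") is kept separate from the file name, and the filehandle is stored in an ordinary variable ($in) instead of a name such as IN. The sample file is created with the standard File::Temp module only so that the sketch can run on its own.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw( tempfile );

my( $dataLine );
my( $fileName );
my( $in );
my( $lineCount );
my( $tempHandle );

#   Create a two-line sample file.

( $tempHandle, $fileName ) = tempfile( UNLINK => 1 );
print( $tempHandle "one\ntwo\n" );
close( $tempHandle );

#   Test for "not success" directly on the value open() returns,
#   and die right away if the file couldn't be opened.

if ( ! open( $in, "<", $fileName ) )
{
    die( "Can't open file '$fileName' for reading: $!.\n" );
}

$lineCount = 0;
while ( defined( $dataLine = <$in> ) )
{
    print( $dataLine );
    $lineCount = $lineCount + 1;
}
close( $in );
```

Because the script has already died if the open failed, the code that reads the file no longer needs to sit inside an else block.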

Writing files

Writing files is somewhat similar to reading files. First we have to open the file. Then we use the print() function to send text data to the file. When we’re done, we close the file.

We open the output file using Perl’s open() function. open() takes two arguments, the name of a filehandle and the name of the file we want to open. Opening a file for writing looks like this:

open( OUT, ">$fileName" );

The first argument, the filehandle, is an identifier that Perl uses to refer to a file once it has been opened. You get to make up the name of the filehandle. It is customary in Perl to make filehandle names all upper-case.

The second argument is the name of the file we want to open. When we want to open the file for writing, we put a greater than symbol (>) in front of the file name. If the file doesn’t exist, Perl creates it; if it already exists, its previous contents are erased.

We always have to make sure that the open() function returned a true value, indicating that the file was opened correctly.

We write data to the opened output file using the print() function. We pass an extra argument, the filehandle, to the print() function so it knows where to send the data. The code looks like this:

print( OUT "This is some text data.\n" );

Finally, when we’re done writing to the file, we use Perl’s close() function to close the file. The code looks like this:

close( OUT );
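
The three steps (open, print, close) can be tried out with the sketch below. It writes to the same file twice and then reads the file back, which shows that opening a file with > replaces whatever the file contained before. The output file name comes from the standard File::Temp module, an assumption made only so that the sketch doesn’t overwrite one of your own files.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw( tempfile );

my( $contents );
my( $dataLine );
my( $outFileName );
my( $success );
my( $tempHandle );

#   Get the name of a scratch file that is safe to overwrite.

( $tempHandle, $outFileName ) = tempfile( UNLINK => 1 );
close( $tempHandle );

#   Write the file once.

$success = open( OUT, ">$outFileName" );
if ( ! $success )
{
    die( "Can't open file $outFileName for writing: $!.\n" );
}
print( OUT "first version\n" );
close( OUT );

#   Write it again. Opening with > erases the first version.

$success = open( OUT, ">$outFileName" );
if ( ! $success )
{
    die( "Can't open file $outFileName for writing: $!.\n" );
}
print( OUT "second version\n" );
close( OUT );

#   Read the file back to see what it contains now.

$success = open( IN, "<$outFileName" );
if ( ! $success )
{
    die( "Can't open file $outFileName for reading: $!.\n" );
}
$contents = "";
while ( defined( $dataLine = <IN> ) )
{
    $contents = $contents . $dataLine;
}
close( IN );

print( $contents );
```

The script prints only "second version", because the second open() discarded the first line that was written.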

We’ll modify the showFile.pl script given above so that we write each line to an output file. We’ll also take advantage of the ! operator. The new script is called copyFile.pl.

#!/usr/bin/perl
#
#   copyFile.pl
#   06-Jun-2004
#
#   Conrad Halling
#   conrad.halling@sphaerula.com
#
#   This script gives an example of opening an input file and an output file.
#   We read each line from the input file and print it to the output file.
#   When we're done, we close both files.

use warnings;

    my( $dataLine );
    my( $fileName );
    my( $outFileName );
    my( $success );

    #   Hard-code the name of the file we're going to read.
    #   This is the script itself.

    $fileName = "copyFile.pl";

    #   Hard-code the name of the file we're going to write.

    $outFileName = "copyFile.pl.copy";

    #   Open the input file using filehandle IN.
    #   die() if open() fails.

    $success = open( IN, "<$fileName" );
    if ( ! $success )
    {
        die( "Can't open file $fileName for reading: $!.\n" );
    }

    #   Open the output file using filehandle OUT.
    #   die() if open() fails.

    $success = open( OUT, ">$outFileName" );
    if ( ! $success )
    {
        die( "Can't open file $outFileName for writing: $!.\n" );
    }

    #   Read each line from the input file and write it to the output file.

    while ( defined( $dataLine = <IN> ) )
    {
        print( OUT $dataLine );
    }
    close( IN );
    close( OUT );

The STDIN, STDOUT, and STDERR filehandles

In our earlier Perl scripts, we got input from the user using the angle operator, <>. Now we see that we can get input from an opened file using the angle operator containing the filehandle (e.g., <IN>). When a script is run without any file names on the command line, as ours have been, <> behaves like <STDIN>, where STDIN is the filehandle used for standard input (that is, usually input from the keyboard).

In our earlier Perl scripts, we used print() to send output to the screen. Now we see that we can send output to an opened file using the print() function with a filehandle, like this:

print( OUT "text\n" );

It turns out that when we don’t give print() a filehandle argument, it prints by default to a filehandle called STDOUT, which is the filehandle used for standard output. That is,

print( "text\n" );

is equivalent to

print( STDOUT "text\n" );

There is a third standard filehandle, STDERR, which is the filehandle used for standard error output. This is where Perl sends error messages, including the message printed by die().
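
We can print to STDERR ourselves in the same way we print to any other filehandle. A minimal sketch follows; the script name used in the note after it, stderr.pl, is made up for this illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

#   Normal results go to standard output.

print( STDOUT "Processing finished.\n" );

#   Diagnostic messages go to standard error.

print( STDERR "Warning: this message was sent to STDERR.\n" );
```

Both lines normally appear on the screen. The difference shows up when output is redirected: on most command lines, running perl stderr.pl > results.txt puts only the first line into results.txt, while the warning still appears on the screen.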

Homework assignment

Write a script to conduct a short interview. Read the interview questions from an input file, questions.txt. Get the response to each question from the user, and write the responses to an output file, responses.txt.

Here are some sample questions that you can copy into the input file, questions.txt:

What is your name?
In what city do you live?
What is one of your hobbies?
What is your favorite programming language?

Here are some guidelines for writing your script.

  1. Create a text file named questions.txt that contains your interview questions. There should be one question per line.
  2. The first version of your script should be able to open the questions.txt file, read each line and print it to the screen, then close the file. Use the examples in this lesson when writing your code.
  3. Now add code that opens the output file, responses.txt. To demonstrate that you can write correctly to the output file, write each question obtained from the questions.txt input file to the responses.txt output file. Make sure that each question occurs on its own line in the output file. Make sure to close the output file when you’ve finished writing.
  4. Now change your script so that you use the <> operator to get the user’s response to each question. For review, see getting user input using <> in Lesson 3. Write the user’s response to the responses.txt output file. Make sure that each response occurs on its own line in the output file.