Book_LearnPerl

Book Details
Learning Perl, 4th edition
- Randal L. Schwartz, Tom Phoenix, brian d foy
- O'Reilly Media
- July 2005
Chap 1 Introduction
What Does "Perl" Stand For?
- Perl is sometimes called:
- Practical Extraction and Report Language
- Pathologically Eclectic Rubbish Lister
- It's a retronym, not an acronym since Larry Wall came up with the name first and the expansion later. That's why "Perl" isn't in all caps.
- In general, "Perl" refers to the language and "perl" refers to the interpreter that compiles and runs your programs.
Why Did Larry Create Perl?
- Larry created Perl in the mid-1980s when he wanted to produce some reports from a Usenet news-like hierarchy of files for a bug-reporting system, and awk ran out of steam.
Why Didn't Larry Just Use Some Other Language?
- At the time, Larry didn't see anything that met his needs. He needed something:
- with the quickness of coding available in shell or awk programming
- with some of the power of more advanced tools like grep, cut, sort, and sed
- without having to resort to a language like C.
- Perl fills the gap between low-level programming (such as in C or C++ or assembly) and high-level programming (such as shell programming).
- Low-level programming: usually hard to write and ugly, but fast and unlimited;
it's hard to beat the speed of a well-written low-level program on a given machine. There, you can do almost anything.
- High-level programming: tends to be slow, hard, ugly, and limited;
there are many things you can't do with the shell or batch programming if there's no command on your system that provides the needed functionality.
- Perl is easy, nearly unlimited, mostly fast, and kind of ugly:
- easy: This means it's easy to use. It's not especially easy to learn
- nearly unlimited: most things that ordinary folks need most of the time are good tasks for Perl
- mostly fast: If someone wants to add a feature that would be cool, but it would slow down other programs,
Larry is almost certain to refuse the new feature until we find a way to make it quick enough.
- kind of ugly: O'Reilly's symbol for Perl is the camel.
Camels get the job done despite all difficulties even when they look bad and smell worse and sometimes spit at you. Perl is a little like that.
Is Perl Easy or Hard?
- Perl is easy to use, but sometimes hard to learn. That's because you'll learn Perl only once, but you'll use it again and again.
- Without Perl's defaults and shortcuts, a typical short Perl snippet would be roughly ten or twelve times longer, so it would take longer to read and write.
It would be harder to maintain and debug, too, with more variables. If you know some Perl and you don't see the variables in such code, that's part of the point: they're all being used by default. But this ease for the working programmer comes at a price when you're learning; you have to learn those defaults and shortcuts.
- Perl's "contractions" abbreviate common "phrases" so that they can be "spoken" quicker and understood by the maintainer
as a single idiom, rather than a series of unrelated steps. Perl code is dense; a Perl program may be around a quarter to three-quarters as long as the corresponding program in C. This makes Perl faster to write, read, debug, and maintain. You don't have to keep scrolling back and forth to see what's going on.
- Perl can be "write-only" in that it's possible to write programs that are impossible to read. But with proper care, you can avoid this common accusation.
What Is Perl Good For?
Perl is optimized for problems that are about 90% working with text and about 10% everything else.
What Is Perl Not Good For?
You shouldn't choose Perl for making an opaque binary.
That's a program that you could give away or sell to someone who then can't see your secret algorithms in the source.
When you give people your Perl program, you'll normally be giving them the source and not an opaque binary.
How Can I Get Perl?
What Is CPAN?
How Can I Get Support for Perl?
- Perl Mongers: a worldwide association of Perl users' groups; see http://www.pm.org/ for more information.
- documentation:
- If you need to ask a question, there are newsgroups on Usenet (comp.lang.perl.*) and any number of mailing lists.
At any hour of the day or night, there's a Perl expert awake in some time zone answering questions on Usenet's Perl newsgroups. If you ask a question, you'll often get an answer within minutes. If you didn't check the documentation and FAQ first, you'll get flamed within minutes.
- A few web communities have sprung up around Perl discussions.
How Do I Make a Perl Program?
- Perl programs are text files; you can create and edit them with your favorite text editor.
- In some cases, you may need to compose the program on one machine and transfer it to another to run it.
If you do this, be sure that the transfer uses text or ASCII mode and not binary mode. This step is needed because of the different text formats on different machines.
A Simple Program
| #!/usr/bin/perl
print "Hello, world!\n"; |
- You can generally save that program under any name you wish. Perl doesn't require any special kind of filename or extension
- You may need to do something so your system knows it's an executable program: chmod a+x my_program
If the bug is that you didn't use chmod correctly, you'll probably get a "permission denied" message from your shell.
- Now you're ready to run it: ./my_program
The dot and slash (./) at the start of this command mean to find the program in the current working directory.
What's Inside That Program?
- In Perl, comments run from a pound sign (#) to the end of the line. (There are no block comments in Perl.)
- That first line is a special comment.
On Unix systems, if the first two characters on the first line of a text file are #!, then what follows is the name of the program that executes the rest of the file. In this case, the program is stored in the file /usr/bin/perl.
- This #! line is the least portable part of a Perl program because you'll need to find out what goes there for each machine.
Fortunately, it's almost always /usr/bin/perl or /usr/local/bin/perl. If that's not it, you might use a shebang line that finds perl for you: #!/usr/bin/env perl
- On non-Unix systems, it's traditional (and even useful) to make the first line say #!perl.
- If that #! line is wrong, you'll generally get an error from your shell.
This may be something unexpected, like "file not found." It's not your program that's not found though; it's /usr/bin/perl that wasn't where it should have been.
- Another problem you could have is if your system doesn't support the #! line at all.
In that case, your shell (or whatever your system uses) will probably run your program by itself, with results that may disappoint or astonish you.
- The "main" program consists of all of the ordinary Perl statements. There's no "main" routine, as there is in languages like C or Java.
There's no required variable declaration section as there is in some other languages. (It allows us to write quick-and-dirty Perl programs.) If you want to declare your variables, that's a good thing; you'll see how to do that in Chapter 4.
How Do I Compile Perl?
- The perl interpreter compiles and then runs your program in one user step: perl my_program
- When you run your program, Perl's internal compiler first runs through your entire source, turning it into internal bytecode,
which is an internal data structure representing the program. Perl's bytecode engine then takes over and runs the bytecode.
- If you have a loop that runs 5,000 times, it's compiled once; the loop can then run at top speed.
And there's no runtime penalty for using as many comments and as much whitespace as you need to make your program easy to understand. If you use calculations involving only constants, the result will be a constant computed once as the program is beginning, not each time through a loop.
- To be sure, this compilation does take time.
It's inefficient to have a voluminous Perl program that does one small quick task and then exits because the runtime for the program will be dwarfed by the compile time. But the compiler is fast; normally the compilation will be a tiny percentage of the runtime.
- An exception might be if you were writing a program run as a CGI script, where it may be called hundreds or thousands of times every minute.
Many of these programs have short runtimes, so the issue of recompilation may become significant. If this is an issue for you, you'll want to find a way to keep your program in memory between invocations. The mod_perl extension to the Apache web server (http://perl.apache.org) or Perl modules like CGI::Fast can help you.
A Whirlwind Tour of Perl
| #!/usr/bin/perl
@lines = `perldoc -u -f atan2`;
foreach (@lines) {
    s/\w<([^>]+)>/\U$1/g;
    print;
} |
Chap 2 Scalar Data
Numbers
All Numbers Have the Same Format Internally
Floating-Point Literals
| 1.25
255.000
255.0
7.25e45   # 7.25 times 10 to the 45th power (a big number)
-6.5e24   # negative 6.5 times 10 to the 24th (a big negative number)
-12e-24   # negative 12 times 10 to the -24th (a very small negative number)
-1.2E-23  # another way to say that - the E may be uppercase |
- A literal is the way a value is represented in the source code of the Perl program.
A literal is not the result of a calculation or an I/O operation; it's data written directly into the source code.
- Perl's floating-point literals:
Numbers with and without decimal points are allowed (including an optional plus or minus prefix), as well as attaching a power-of-10 indicator (exponential notation) with E notation.
Integer Literals
| 0
2001
-40
61_298_040_283_768  # Perl allows underscores for clarity within integer literals |
Non-Decimal Integer Literals
| 0377           # 377 octal, same as 255 decimal
0xff           # FF hex, also 255 decimal
0b11111111     # also 255 decimal
0x1377_0B77    # Perl allows underscores for clarity within these literals
0x50_65_72_7C  # Perl allows underscores for clarity within these literals |
- Perl allows you to specify numbers in bases other than 10 (decimal):
- Octal (base 8): start with a leading 0
- hexadecimal (base 16): start with a leading 0x
- binary (base 2): start with a leading 0b
- It makes no difference to Perl whether you write 0xFF or 255.000, so choose the representation that makes the most sense to you and your maintenance programmer.
- The "leading zero" indicator works only for literals and not for automatic string-to-number conversion, which you'll see later in this chapter.
You can convert a data string that looks like an octal or hex value into a number with oct( ) or hex( ). Though there's no "bin" function for converting binary values, oct( ) can do that for strings beginning with 0b.
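The difference between prefixed literals and prefixed strings can be sketched as follows. This is a minimal illustration, not code from the book; the variable names are made up here, and `use strict` with `my` is used even though the notes only introduce declarations later.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Literal: the leading 0 is understood when the program is compiled.
my $from_literal = 0377;            # octal literal, 255 decimal

# String: automatic conversion reads it as plain decimal.
my $auto = '0377' + 0;              # 377, not 255

# Strings that look like octal, hex, or binary need oct() or hex().
my $octal  = oct('0377');           # 255
my $hex    = hex('ff');             # 255
my $binary = oct('0b11111111');     # 255, oct() understands the 0b prefix

print "$from_literal $auto $octal $hex $binary\n";
```

oct( ) is used for the binary string because, as noted above, there is no separate "bin" function.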
Numeric Operators
| 14 / 2      # 14 divided by 2, or 7
10.2 / 0.3  # 10.2 divided by 0.3, or 34
10 / 3      # always floating-point divide, so 3.3333333...
10 % 3      # the remainder when 10 is divided by 3, which is 1
10.5 % 3.2  # both values are first reduced to their integer values, so this is computed as 10 % 3
2**3        # two to the third power, or 8 |
- Perl provides the typical ordinary addition, subtraction, multiplication, division, modulus (%) and FORTRAN-like exponentiation (**) operators
- Both operands of the modulus operator are first reduced to their integer values.
The result of a modulus operator when a negative number (or two) is involved can vary between Perl implementations. Beware.
- You can't normally raise a negative number to a noninteger exponent. Math geeks know that the result would be a complex number.
To make that possible, you'll need the help of the Math::Complex module.
Strings
- Perl uses length counting, not a null byte, to determine the end of the string
- The shortest possible string has no characters. The longest string fills all of your available memory.
This is in accordance with the principle of "no built-in limits" that Perl follows at every opportunity.
- The ability to have any character in a string means you can create, scan, and manipulate raw binary data as strings,
and that is something with which many other utilities would have great difficulty. For example, you could update a graphical image or compiled program by reading it into a Perl string, making the change, and writing the result back out.
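A small sketch of what length counting means in practice: a NUL byte in the middle of a string is just another character. The variable name is invented for illustration, and substr( ) is a standard Perl function that these notes have not covered yet.

```perl
use strict;
use warnings;

my $raw = "ab\0cd";                 # \0 is a NUL byte in the middle
my $before = length($raw);          # 5: the NUL counts like any other character

# Replace the NUL with another byte; the string keeps its length.
substr($raw, 2, 1) = "\xff";
my $after = length($raw);           # still 5

print "$before $after\n";
```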
Single-Quoted String Literals
| 'fred'     # those four characters: f, r, e, and d
''         # the null string (no characters)
'Don\'t let an apostrophe end this string prematurely!'
'the last character of this string is a backslash: \\'
'hello\n'  # hello followed by backslash followed by n
'hello
there'     # hello, newline, there (11 characters total)
'\'\\'     # single quote followed by backslash |
- A single-quoted string literal is a sequence of characters enclosed in single quotes.
The single quotes are not part of the string itself but are there to let Perl identify the beginning and the ending of the string. Any character other than a single quote or a backslash between the quote marks (including newline characters) stands for itself inside a string.
- To get a backslash, put two backslashes in a row; to get a single quote, put a backslash followed by a single quote.
- The \n within a single-quoted string is not interpreted as a newline but as the two characters backslash and n.
- Only when the backslash is followed by another backslash or a single quote does it have special meaning.
Double-Quoted String Literals
| "barney"         # just the same as 'barney'
"hello world\n"  # hello world, and a newline
"The last character of this string is a quote mark: \""
"coke\tsprite"   # coke, a tab, and sprite |
| \n | Newline | \r | Return |
| \t | Tab | \f | Formfeed |
| \b | Backspace | \a | Bell |
| \e | Escape (ASCII escape character) | \cC | A "control" character (here, Ctrl-C) |
| \x7f | Any hex ASCII value (here, 7f = delete) | \007 | Any octal ASCII value (here, 007 = bell) |
| \\ | Backslash | \" | Double quote |
| \l | Lowercase next letter | \L | Lowercase all following letters until \E |
| \u | Uppercase next letter | \U | Uppercase all following letters until \E |
| \Q | Quote non-word characters by adding a backslash until \E | \E | End \L, \U, or \Q |
String Operators
- concatenation operator (.): doesn't alter either string. The resulting string is then available for further computation or assignment to a variable
| "hello" . "world"        # same as "helloworld"
"hello" . ' ' . "world"  # same as 'hello world'
'hello world' . "\n"     # same as "hello world\n" |
- repetition operator (x): takes its left operand (a string) and makes as many concatenated copies of that string as indicated by its right operand (a number)
The copy count (the right operand) is first truncated to an integer value (4.8 becomes 4) before being used. A copy count of less than one results in an empty (zero-length) string. The repetition operator wants a string for its left operand, so a number such as 5 is first converted to the string "5".
| "fred" x 3      # is "fredfredfred"
"barney" x (4+1) # is "barney" x 5, or "barneybarneybarneybarneybarney"
5 x 4           # is really "5" x 4, which is "5555"
5 x 4.8         # is really "5" x 4 (4.8 is first truncated to an integer value, 4)
5 x 0.8         # is really "5" x 0, that is, the empty (zero-length) string |
Automatic Conversion Between Numbers and Strings
| "12" * "3"         # gives the value 36 ('*' operator needs numeric values, Perl converts "12" and "3" to numbers)
"12fred34" * " 3"  # gives the value 36 (trailing nonnumber stuff and leading whitespace are discarded)
"Z" . 5 * 7        # same as "Z" . 35, or "Z35" (the numeric value expands into whatever string would have been printed for that number) |
- Perl automatically converts between numbers and strings as needed.
- How does it know whether a number or a string is needed? It all depends on the operator being used on the scalar value.
If an operator expects a number (as + does), Perl will see the value as a number. If an operator expects a string (like . does), Perl will see the value as a string. You don't need to worry about the difference between numbers and strings; use the proper operators, and Perl will make it all work.
- When a string value is used where an operator needs a number (say, for multiplication),
Perl automatically converts the string to its equivalent numeric value as if it had been entered as a decimal floating-point value. So "12" * "3" gives the value 36. (The trick of using a leading zero to mean a non-decimal value works for literals but never for automatic conversion. Use hex( ) or oct( ) to convert those kinds of strings.)
- Trailing nonnumber stuff and leading whitespace are discarded, so "12fred34" * " 3" will give 36 without any complaints.
At the extreme end of this, something that isn't a number at all converts to zero. This would happen if you used the string "fred" as a number.
- If a numeric value is given when a string value is needed, the numeric value expands into whatever string would have been printed for that number.
Perl's Built-in Warnings
Warnings
- Perl can be told to warn you when it sees something suspicious going on in your program: run it with the -w command-line option, or add the use warnings pragma.
- With warnings on, Perl will warn you if you use '12fred34' as if it were a number: Argument "12fred34" isn't numeric
- Warnings won't change the behavior of your program except that now it will emit gripes once in a while.
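As a rough sketch of the points above, the program below turns warnings on and uses "12fred34" as a number. To make the warning visible as data, it collects warnings through Perl's `$SIG{__WARN__}` hook; that hook and the names `@seen` and `$text` are illustration choices here, not something these notes cover.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to STDERR.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

my $text    = "12fred34";
my $product = $text * " 3";         # still 36; the warning doesn't change the result

print "product: $product\n";
print "warning: $_" for @seen;
```

The result is the same with or without warnings; the only difference is the extra message about the non-numeric argument.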
Diagnostics
- If you get a warning message you don't understand, you can get a longer description of the problem with the diagnostics pragma: use diagnostics;
- When you add the use diagnostics pragma to your program, it may seem to you that your program now pauses for a moment whenever you launch it.
That's because your program has to do a lot of work in case you want to read the documentation as soon as Perl notices your mistakes, if any.
- This leads to a nifty optimization that can speed up your program's launch (and shrink its memory footprint) with no adverse impact on users:
once you no longer need to read the documentation about the warning messages produced by your program, remove the use diagnostics pragma.
- A further optimization can be had by using one of Perl's command-line options, -M,
to load the pragma only when needed instead of editing the source code each time to enable and disable diagnostics: perl -Mdiagnostics ./my_program
Scalar Variables
- A variable is a name for a container that holds one or more values.
(As you'll see, a scalar variable can hold only one value. But other types of variables, such as arrays and hashes, may hold many values.)
- Scalar variable names begin with a dollar sign ($) followed by what we'll call a Perl identifier:
a letter or underscore, and then possibly more letters, or digits, or underscores. (Another way to think of it is that it's made up of alphanumerics and underscores but can't start with a digit.) Uppercase and lowercase letters are distinct.
- Scalar variables in Perl are always referenced with the leading $.
Choosing Good Variable Names
- You should generally select variable names that mean something regarding the purpose of the variable.
- A variable used for only two or three lines close together may be called something like $n, but a variable used throughout a program should probably have a more descriptive name.
- Similarly, properly placed underscores can make a name easier to read and understand, especially if your maintenance programmer has a different spoken language background; $super_bowl, for example, is easier to read than $superbowl.
- Most variable names in our Perl programs are all lowercase like most of the ones you'll see in this book.
In a few special cases, uppercase letters are used. Using all caps (like $ARGV) generally indicates that there's something special about that variable. When a variable's name has more than one word, some say $underscores_are_cool while others say $giveMeInitialCaps. Just be consistent.
- Of course, choosing good or poor names makes no difference to Perl.
Scalar Assignment
Binary Assignment Operators
+=, -=, *=, /=, **=, %=, .=(string concatenation)
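A short sketch of how these shorthand operators behave; each one applies the operator to the variable's current value and stores the result back. The variable names are arbitrary.

```perl
use strict;
use warnings;

my $fred = 5;
$fred += 5;         # same as $fred = $fred + 5;  now 10
$fred *= 3;         # same as $fred = $fred * 3;  now 30
$fred -= 6;         # now 24
$fred /= 4;         # now 6
$fred **= 2;        # now 36
$fred %= 7;         # now 1

my $str = "hello";
$str .= " world";   # same as $str = $str . " world";
$str .= "\n";

print "fred is $fred\n";
print $str;
```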
Output with print
| print "hello world\n";                 # say hello world, followed by a newline
print "The answer is ", 6 * 7, ".\n";  # a series of values separated by commas |
- print operator takes a scalar argument and puts it out without any embellishment onto standard output
- You can give print a series of values, separated by commas
Interpolation of Scalar Variables into Strings
| $meal = "brontosaurus steak";
$barney = "fred ate a $meal";     # $barney is now "fred ate a brontosaurus steak"
$barney = 'fred ate a ' . $meal;  # another way to write that (can get the same results without the double quotes) |
| $barney = "fred ate a $meat";     # $barney is now "fred ate a " ($meat has never been given a value (undef), so the empty string is used instead) |
| print "$fred";  # unneeded quote marks
print $fred;    # better style (string variable itself is scalar data) |
| print "The name is \$fred.\n";     # prints a dollar sign
print 'The name is $fred' . "\n";  # so does this |
| $what = "brontosaurus steak";
$n = 3;
print "fred ate $n $whats.\n";                  # not the steaks, but the value of $whats
print "fred ate $n ${what}s.\n";                # now uses $what
print "fred ate $n $what" . "s.\n";             # another way to do it
print 'fred ate ' . $n . ' ' . $what . "s.\n";  # an especially difficult way |
- When a string literal is double-quoted, it is subject to variable interpolation besides being checked for backslash escapes.
This means that any scalar variable name (and some other variable types) in the string is replaced with its current value.
- Variable interpolation is also known as double-quote interpolation because it happens when double-quote marks (but not single quotes) are used.
- You can get the same results without the double quotes, but the double-quoted string is often the more convenient way to write it.
- If the scalar variable has never been given a value (undef) , the empty string is used instead
- Don't bother with interpolating if you have the one lone variable (string variable itself is scalar data)
There's nothing wrong with putting quote marks around a lone variable, but the other programmers will laugh at you behind your back. (It's a waste of typing.)
- To put a real dollar sign into a double-quoted string, precede the dollar sign with a backslash (or use single quotes), which turns off the dollar sign's special significance.
- If you want to follow the replaced value immediately with some constant text that begins with a letter, digit, or underscore,
Perl would consider those characters as additional name characters, which is not what you want.
- Perl provides a delimiter for the variable name in a manner similar to the shell.
Enclose the name of the variable in a pair of curly braces. Or you can end that part of the string and start another part of the string with a concatenation operator.
Operator Precedence and Associativity
- You can override the default precedence order by using parentheses. Anything in parentheses is completely computed before the operator outside of the parentheses is applied.
- Use parentheses when you don't remember the order of operations or when you're too busy to look in the chart.
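A few expressions illustrating precedence, associativity, and the effect of parentheses. This is only a sketch with made-up variable names; the full precedence chart is in the perlop documentation.

```perl
use strict;
use warnings;

my $default  = 2 + 3 * 4;     # * binds tighter than +, so 2 + 12 = 14
my $grouped  = (2 + 3) * 4;   # parentheses first: 5 * 4 = 20
my $power    = 2 ** 3 ** 2;   # ** is right-associative: 2 ** 9 = 512
my $subtract = 10 - 4 - 3;    # - is left-associative: (10 - 4) - 3 = 3
my $mixed    = "Z" . 5 * 7;   # * binds tighter than . so this is "Z" . 35

print "$default $grouped $power $subtract $mixed\n";
```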
Comparison Operators
| Comparison | Numeric | String |
| Equal | == | eq |
| Not equal | != | ne |
| Less than | < | lt |
| Greater than | > | gt |
| Less than or equal to | <= | le |
| Greater than or equal to | >= | ge |
| '35' eq '35.0'  # false (comparing as strings)
' ' gt ''        # true |
The if Control Structure
| if ($name gt 'fred') {
    print "'$name' comes after 'fred' in sorted order.\n";
} else {
    print "'$name' does not come after 'fred'.\n";
    print "Maybe it's the same string, in fact.\n";
} |
- Those block curly braces ({}) are required around the conditional code (unlike C).
Boolean Values
Getting User Input
| $line = <STDIN>;
if ($line eq "\n") {
    print "That was just a blank line!\n";
} else {
    print "That line of input was: $line";
} |
The chomp Operator
| chomp($text = <STDIN>); # Read the text, without the newline character |
The while Control Structure
| $count = 0;
while ($count < 10) {
    $count += 2;
    print "count is now $count\n";  # Gives values 2 4 6 8 10
} |
- the block curly braces are required
The undef Value
| $n = 1;
while ($n < 10) {
    $sum += $n;
    $n += 2;  # On to the next odd number
}
print "The total was $sum.\n"; | $sum is not initialized, but undef acts like 0 |
| $string .= "more text\n"; | $string is not initialized, but undef acts like the empty string |
| $madonna = <STDIN>;
if ( defined($madonna) ) {
    print "The input was $madonna";
} else {
    print "No input available!\n";
} | <STDIN> returns undef when there is no more input; use defined to tell whether a value is undef |
| $madonna = undef; | make our own undef, as if the variable had never been touched |
Chap 3 Lists and Arrays
- A list is an ordered collection of scalars. An array is a variable that contains a list.
To be accurate, the list is the data, and the array is the variable. You can have a list value that isn't in an array, but every array variable holds a list.
- The elements of an array or list are indexed by small integers starting at zero and counting by ones.
(In early Perl, it was possible to change the starting number of array and list indexing. Look up the $[ variable in the perlvar manpage.)
- Since each element is an independent scalar value, a list or array may hold numbers, strings, undef values, or any mixture of different scalar values.
Nevertheless, it's most common to have all elements of the same type.
- Arrays and lists can have any number of elements. (Perl's philosophy of "no unnecessary limits")
The smallest one has no elements, and the largest can fill all of the available memory.
Accessing Elements of an Array
| $rocks[0] = 'bedrock';       # One element...
$rocks[1] = 'slate';         # another...
$rocks[2] = 'lava';          # and another...
$rocks[3] = 'crushed rock';  # and another...
$rocks[99] = 'schist';       # now there are 95 undef elements |
| $end = $#rocks;                   # 99, which is the last element's index
$number_of_rocks = $end + 1;      # okay, but you'll see a better way later
$rocks[ $#rocks ] = 'hard rock';  # the last rock |
| $rocks[ -1 ] = 'hard rock';  # easier way to do that last example
$dead_rock = $rocks[-100];   # gets 'bedrock'
$rocks[ -200 ] = 'crystal';  # fatal error! |
- The array name is from a completely separate namespace than scalars use.
You could have a scalar variable with the same name in the same program. Perl treats them as different things and doesn't get confused. (Though your maintenance programmer might be confused.)
- The subscript may be any expression that gives a numeric value.
If it's not an integer, it'll automatically be truncated to the next lower integer.
- If the subscript indicates an element that would be beyond the end of the array, the corresponding value will be undef.
This is the same as ordinary scalars; if you've never stored a value into the variable, it's undef.
- If you store in an array an element that is beyond the end of the array, the array is automatically extended as needed.
If Perl needs to create the intervening elements, it creates them as undef values. There's no limit on its length as long as there's available memory for Perl to use.
- The last element index is $#array_name. That's not the same as the number of elements because there's an element number zero.
- Negative array indices count from the end of the array.
List Literals
| (1, 2, 3)      # list of three values 1, 2, and 3
(1, 2, 3,)     # the same three values (the trailing comma is ignored)
("fred", 4.5)  # two values, "fred" and 4.5
( )            # empty list - zero elements
(1..100)       # list of 100 integers |
| (1..5)             # same as (1, 2, 3, 4, 5)
(1.7..5.7)         # same thing - both values are truncated
(5..1)             # empty list - .. only counts "uphill"
(0, 2..6, 10, 12)  # same as (0, 2, 3, 4, 5, 6, 10, 12)
($m..$n)           # range determined by current values of $m and $n
(0..$#rocks)       # the indices of the rocks array from the previous section |
| ($m, 17)        # two values: the current value of $m, and 17
($m+$o, $p+$q)  # two values |
| ("fred", "barney", "betty", "wilma", "dino")
qw( fred barney betty wilma dino )  # same as above, but less typing |
- .. (range operator): creates a list of values by counting from the left scalar up to the right scalar by ones
- The elements of a list literal are not necessarily constants; they can be expressions that will be newly evaluated each time the literal is used.
- qw (quoted words or quoted by whitespace) : makes it easy to generate them without typing a lot of extra quote marks
List Assignment
| ($fred, $barney, $dino) = ("flintstone", "rubble", undef);
($betty[0], $betty[1]) = ($betty[1], $betty[0]); |
| @rocks = qw/ bedrock slate lava /;
@tiny = ( );                       # the empty list
@giant = 1..1e5;                   # a list with 100,000 elements
@stuff = (@giant, undef, @giant);  # a list with 200,001 elements
$dino = "granite";
@quarry = (@rocks, "crushed rock", @tiny, $dino);  # five-element list (bedrock, slate, lava, crushed rock, granite) |
| @copy = @quarry;  # copy a list from one array to another |
- Since the list is built up before the assignment starts, this makes it easy to swap two variables' values in Perl
- But what happens if the number of variables (on the left side of the equals sign) isn't the same as the number of values (from the right side)?
- In a list assignment, extra values are silently ignored.
- if you have too many variables, the extras get the value undef.
- Just use @ before the name of the array (and no index brackets after it) to refer to the entire array at once.
You can read this as "all of the", so @rocks is "all of the rocks". (Larry claims that he chose the $ and @ because they can be read as $calar (scalar) and @rray (array).)
- The value of an array variable that has not yet been assigned is ( ), the empty list.
- An array name in the list is replaced by the list it contains
An array doesn't become an element in the list because these arrays can contain only scalars, not other arrays.
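The mismatched-count rules and the flattening of arrays inside a list can be sketched as below. The names are borrowed from the examples above purely for illustration, and `my` is used for declarations even though these notes introduce it later.

```perl
use strict;
use warnings;

# More values than variables: the extra values are silently ignored.
my ($fred, $barney) = qw( flintstone rubble slate granite );

# More variables than values: the extra variable gets undef.
my ($wilma, $dino) = qw( flintstone );

# An array inside a list is replaced by its elements, not nested.
my @rocks  = qw( bedrock slate lava );
my @quarry = (@rocks, "granite");
my $count  = $#quarry + 1;          # 4 elements, not 2

print "$fred $barney\n";
print "dino is ", defined($dino) ? $dino : 'undef', "\n";
print "quarry has $count elements\n";
```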
The pop and push Operators
| @array = 5..9;
$fred = pop(@array);  # $fred gets 9, @array now has (5, 6, 7, 8)
$barney = pop @array; # $barney gets 8, @array now has (5, 6, 7)
pop @array;           # @array now has (5, 6). (The 7 is discarded.) |
| push(@array, 0);      # @array now has (5, 6, 0)
push @array, 8;       # @array now has (5, 6, 0, 8)
push @array, 1..10;   # @array now has those 10 new elements
@others = qw/ 9 0 2 1 0 /;
push @array, @others; # @array now has those five new elements (19 total) |
The shift and unshift Operators
| @array = qw# dino fred barney #;
$m = shift(@array);       # $m gets "dino", @array now has ("fred", "barney")
$n = shift @array;        # $n gets "fred", @array now has ("barney")
shift @array;             # @array is now empty
$o = shift @array;        # $o gets undef, @array is still empty
unshift(@array, 5);       # @array now has the one-element list (5)
unshift @array, 4;        # @array now has (4, 5)
@others = 1..3;
unshift @array, @others;  # @array now has (1, 2, 3, 4, 5) |
The push and pop operators do things to the end of an array.
Similarly, the unshift and shift operators perform the corresponding actions on the start of the array
Interpolating Arrays into Strings
| @rocks = qw{ flintstone slate rubble };
print "quartz @rocks limestone\n";  # prints five rocks separated by spaces |
| $email = "fred@bedrock.edu";   # WRONG! Tries to interpolate @bedrock
$email = "fred\@bedrock.edu";  # Correct
$email = 'fred@bedrock.edu';   # Another way to do that |
| @fred = qw(hello dolly);
$y = 2;
$x = "This is $fred[1]'s place";     # "This is dolly's place"
$x = "This is $fred[$y-1]'s place";  # same thing |
| @fred = qw(eating rocks is wrong);
$fred = "right";                   # we are trying to say "this is right[3]"
print "this is $fred[3]\n";        # prints "wrong" using $fred[3]
print "this is ${fred}[3]\n";      # prints "right" (protected by braces)
print "this is $fred"."[3]\n";     # right again (different string)
print "this is $fred\[3]\n";       # right again (backslash hides it) |
- Like scalars, array values may be interpolated into a double-quoted string.
- Elements of an array are automatically separated by spaces upon interpolation
(The separator is the value of the special $" variable, which is a space by default.)
- There are no extra spaces added before or after an interpolated array.
- If you put an email address (xxx@yyy) into a double-quoted string, remember to escape the @ (\@),
or Perl may treat @yyy as an array variable.
- A single element of an array will be replaced by its value as you'd expect
- The index expression is evaluated as an ordinary expression, as if it were outside a string
- If you want to follow a simple scalar variable with a left square bracket, you need to delimit the it so it isn't considered part of an array reference
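The interpolation rules can be checked with a short sketch. Changing $" with local is not part of the notes above; it is shown here only as one assumed way to make the separator visible.

```perl
use strict;
use warnings;

my @rocks = qw( flintstone slate rubble );
my $plain = "quartz @rocks limestone";    # elements joined by $" (a space)

my $joined;
{
    local $" = ', ';                      # temporarily change the separator
    $joined = "rocks: @rocks";
}

my $email = "fred\@bedrock.edu";          # \@ keeps the @ literal

my $fred = 'right';
my @fred = qw( eating rocks is wrong );
my $element = "this is $fred[3]";         # element of @fred: "this is wrong"
my $scalar  = "this is ${fred}[3]";       # scalar $fred, then a literal "[3]"

print "$plain\n$joined\n$email\n$element\n$scalar\n";
```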
The foreach Control Structure
| @rocks = qw/ bedrock slate lava /; foreach $rock (@rocks) { $rock = "\t$rock"; # put a tab in front of each element of @rocks $rock .= "\n"; # put a newline on the end of each } print "The rocks are:\n", @rocks; # Each one is indented, on its own line |
- The foreach loop steps through a list of values, executing one iteration (time through the loop) for each value
- The control variable takes on a new value from the list for each iteration
- The control variable is not a copy of the list element; it actually is the list element.
That is, if you modify the control variable inside the loop, you'll be modifying the element in the original list.
- What is the value of the control variable after the loop has finished? It's the same as it was before the loop started.
Perl saves and restores it automatically: after the loop is done, the variable has the value it had before the loop, or undef if it didn't have a value.
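A minimal sketch of both points, the aliasing and the restored value. The starting value 'granite' is invented for the example, and our is used only so that the sketch runs under use strict.

```perl
use strict;
use warnings;

my @rocks = qw( bedrock slate lava );
our $rock = 'granite';       # the control variable already holds a value

foreach $rock (@rocks) {     # $rock is each element in turn, not a copy
    $rock = uc $rock;        # so this assignment changes @rocks itself
}

print "@rocks\n";            # BEDROCK SLATE LAVA
print "$rock\n";             # granite (restored after the loop)
```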
Perl's Favorite Default: $_
| foreach (1..10) { # Uses $_ by default print "I can count to $_!\n"; } | | $_ = "Yabba dabba doo\n"; print; # prints $_ by default |
- If you omit the control variable from the beginning of the foreach loop, Perl uses its favorite default variable, $_.
- This saves the programmer from the heavy labor of having to think up and type a new variable name.
The reverse Operator
| @fred = 6..10; @barney = reverse(@fred); # gets 10, 9, 8, 7, 6 @wilma = reverse 6..10; # gets the same thing, without the other array @fred = reverse @fred; # puts the result back into the original array |
- The reverse operator takes a list of values (which may come from an array) and returns the list in the opposite order
- reverse returns the reversed list; it doesn't affect its arguments. If the return value isn't assigned anywhere, it's useless
The sort Operator
| @rocks = qw/ bedrock slate rubble granite /; @sorted = sort(@rocks); # gets bedrock, granite, rubble, slate @back = reverse sort @rocks; # these go from slate to bedrock @rocks = sort @rocks; # puts sorted result back into @rocks @numbers = sort 97..102; # gets 100, 101, 102, 97, 98, 99 |
Scalar and List Context
| @people = qw( fred barney betty ); @sorted = sort @people; # list context: barney, betty, fred $number = 42 + @people; # scalar context: 42 + 3 gives 45 | | @list = @people; # a list of three people $n = @people; # the number 3 |
Using List-Producing Expressions in Scalar Context
| @backwards = reverse qw/ yabba dabba doo /; # gives doo, dabba, yabba $backwards = reverse qw/ yabba dabba doo /; # gives oodabbadabbay | | $fred = something; # scalar context @pebbles = something; # list context ($wilma, $betty) = something; # list context ($dino) = something; # still list context! |
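- sort: has no useful value in a scalar context. The book says it always returns undef; perldoc describes the behavior as undefined, so don't rely on it.
- reverse: in a scalar context, it returns a reversed string (if given a list, it first concatenates all the strings of the list and then reverses the result).
- Don't be fooled by the one-element list in ($dino) = something; the parentheses make it a list context, not a scalar one.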
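The same expression can give different results depending on the context that the assignment supplies. A short sketch, reusing the values from the tables above:

```perl
use strict;
use warnings;

my @people = qw( fred barney betty );

my $count   = @people;       # scalar context: the number of elements (3)
my ($first) = @people;       # list context: the first element ("fred")

my @backwards = reverse qw( yabba dabba doo );   # list: doo, dabba, yabba
my $backwards = reverse qw( yabba dabba doo );   # string: "oodabbadabbay"

print "$count $first @backwards $backwards\n";
```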
Using Scalar-Producing Expressions in List Context
| @fred = 6 * 7; # gets the one-element list (42) @barney = "hello" . ' ' . "world"; | | @wilma = undef; # OOPS! Gets the one-element list (undef) # which is not the same as this: @betty = ( ); # A correct way to empty an array |
- If an expression doesn't normally have a list value, the scalar value is automatically promoted to make a one-element list
- Since undef is a scalar value, assigning undef to an array doesn't clear the array. The better way to do that is to assign an empty list
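Counting the elements makes the difference between the two assignments visible. This is only a sketch; the variable names follow the table above.

```perl
use strict;
use warnings;

my @fred  = 6 * 7;     # scalar promoted to the one-element list (42)
my @wilma = undef;     # one element, which happens to be undef
my @betty = ();        # really empty

my $fred_count  = @fred;     # 1
my $wilma_count = @wilma;    # 1, not 0
my $betty_count = @betty;    # 0

print "$fred_count $wilma_count $betty_count\n";   # prints: 1 1 0
```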
Forcing Scalar Context
| @rocks = qw( talc quartz jade obsidian ); print "How many rocks do you have?\n"; print "I have ", @rocks, " rocks!\n"; # WRONG, prints names of rocks print "I have ", scalar @rocks, " rocks!\n"; # Correct, gives a number |
<STDIN> in List Context
| chomp(@lines = <STDIN>); # Read the lines, not the newlines |
- In list context, the <STDIN> operator returns all of the remaining lines up to the end of file.
Each line is returned as a separate element of the list.
- To signal end-of-file when the input comes from the keyboard:
- On Unix and similar systems: Ctrl-D
- On DOS/Windows systems: Ctrl-Z
- If you give chomp an array holding a list of lines, it removes the newline from each item in the list.
- In list context, the line-input operator reads all of the lines at once, which can gobble up lots of memory:
a 400 MB file will typically take up at least 1 GB of memory when read into an array. This is because Perl generally wastes memory to save time. (This is a good tradeoff: if you're short of memory, you can buy more; if you're short on time, you're hosed.) If the input is large, you should generally find a way to deal with it without reading it all into memory at once.
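The list-context read plus chomp can be tried without typing at the keyboard. In this sketch an in-memory filehandle ($fh, opened on the made-up string $input) stands in for STDIN; the line-input operator behaves the same way on it.

```perl
use strict;
use warnings;

my $input = "fred\nbarney\nbetty\n";      # stand-in for keyboard or file input
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

chomp(my @lines = <$fh>);                 # list context: every remaining line
close $fh;

print scalar(@lines), " lines: @lines\n"; # 3 lines: fred barney betty
```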
Chap 4 Subroutines
Defining a Subroutine
| sub marine { $n += 1; # Global variable $n print "Hello, sailor number $n!\n"; } |
Invoking a Subroutine
- Invoke (call) a subroutine from within any expression by using the subroutine name with a leading ampersand, for example &marine;
Return Values
| sub sum_of_fred_and_barney { print "Hey, you called the sum_of_fred_and_barney subroutine!\n"; $fred + $barney; # That's the return value # print "Hey, I'm returning a value now!\n"; # If this line is uncommented, print becomes the last expression evaluated: the subroutine returns print's result (normally 1), and "$fred + $barney" is evaluated in void context } | | $fred = 3; $barney = 4; $wilma = &sum_of_fred_and_barney; # $wilma gets 7 |
- The subroutine is always invoked as part of an expression even if the result of the expression isn't being used.
When we invoked &marine earlier, we were calculating the value of the expression containing the invocation but then throwing away the result.
- All Perl subroutines have a return value.
Not all Perl subroutines have a useful return value, however.
- Whatever calculation is last performed in a subroutine is automatically also the return value,
so be careful when adding additional code to a subroutine.
- Void context is a fancy way of saying that the answer isn't being stored in a variable or used in any other way.
- "The last expression evaluated" really means the last expression evaluated, rather than the last line of text.
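Both pitfalls can be sketched in a few lines. The subroutine noisy_sum is a made-up name for illustration, and our and my are used only so that the sketch runs under use strict.

```perl
use strict;
use warnings;

our ($fred, $barney) = (3, 4);

sub larger_of_fred_or_barney {
    # The last expression evaluated is whichever branch runs,
    # not the last line of text in the subroutine.
    if ($fred > $barney) { $fred } else { $barney }
}

sub noisy_sum {
    my $sum = $fred + $barney;
    print "The sum is $sum\n";   # print is evaluated last, so its result is returned
}

my $larger = &larger_of_fred_or_barney();   # 4
my $oops   = &noisy_sum();                  # 1 (print's success value), not 7
```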
Arguments
| $n = &max(10, 15); # This sub call has two parameters | | sub max { # Compare this to &larger_of_fred_or_barney if ($_[0] > $_[1]) { $_[0]; } else { $_[1]; } } |
- To pass an argument list to the subroutine, place the list expression, in parentheses, after the subroutine invocation
- The list is passed to the subroutine; that is, it's made available for the subroutine to use however it needs to
- Perl automatically stores the argument list in the special array variable named @_ for the duration of the subroutine.
The subroutine can access this variable to determine the number of arguments and the value of those arguments.
- This means that the first subroutine parameter is stored in $_[0], the second one is stored in $_[1], and so on.
- A subroutine that looks only at $_[0] and $_[1], like &max above, ignores excess parameters; if too few are passed, the missing ones are simply undef.
- The @_ variable is private to the subroutine;
if there's a global value in @_, it is saved before the subroutine is invoked and restored to its previous value upon return from the subroutine.
- This means that a subroutine can pass arguments to another subroutine without fear of losing its own @_ variable.
Even if the subroutine calls itself recursively, each invocation gets a new @_, so @_ is always the parameter list for the current subroutine invocation.
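A short sketch of how @_ behaves. The subroutines how_many, first_two, and factorial are hypothetical helpers written for this example, not code from the book.

```perl
use strict;
use warnings;

sub how_many  { scalar @_ }             # number of arguments passed
sub first_two { ($_[0], $_[1]) }        # a missing argument shows up as undef

sub factorial {
    if ($_[0] <= 1) { 1 }
    else { $_[0] * &factorial($_[0] - 1) }   # the inner call gets its own @_
}

my $count = &how_many(10, 15, 20);      # 3
my @pair  = &first_two('fred');         # ('fred', undef)
my $fact  = &factorial(5);              # 120

print "$count $fact\n";                 # prints: 3 120
```

After each recursive call returns, $_[0] again refers to the current invocation's own argument, which is why the multiplication gives the right answer.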
Private Variables in Subroutines
| sub max { my($m, $n) = @_; # Name the subroutine parameters if ($m > $n) { $m } else { $n } } |
Variable-Length Parameter Lists
| sub max { if (@_ != 2) { print "WARNING! &max should get exactly two arguments!\n"; } # continue as before... } |
A Better &max Routine
| $maximum = &max(3, 5, 10, 4, 6); sub max { my($max_so_far) = shift @_; # the first one is the largest yet seen foreach (@_) { # look at the remaining arguments if ($_ > $max_so_far) { # could this one be bigger yet? $max_so_far = $_; } } $max_so_far; } |
- In real-world Perl programming, the argument-count check from the previous section is rarely used; it's better to make the subroutine adapt to the parameters.
- The foreach loop steps through the remaining values in the parameter list from @_.
The control variable of the loop is, by default, $_. (But, remember, there's no automatic connection between @_ and $_; it's a coincidence that they have similar names.)
Notes on Lexical (my) Variables
| my($num) = @_; # list context, same as ($num) = @_; my $num = @_; # scalar context, same as $num = @_; | | my $fred, $barney; # WRONG! Fails to declare $barney my($fred, $barney); # declares both | | my @phone_number; |
- Lexical variables can be used in any block, not only in a subroutine's block.
- They are private to the enclosing block.
If there's no enclosing block, the variable is private to the entire source file.
- The scope of a lexical variable's name is limited to the smallest enclosing block or file.
This is a big win for maintainability: if the wrong value is found in one, the culprit will be found within a limited amount of source code. As experienced programmers have learned, limiting the scope of a variable to a page of code, or to a few lines of code, accelerates the development and testing cycle.
- The my operator doesn't change the context of an assignment.
- Don't declare two lexical variables with the same name in the same scope: the second declaration masks the first, and Perl warns about it when warnings are on.
- Without the parentheses, my declares only a single lexical variable.
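The scoping and context rules can be seen in a small sketch. All of the variable names and values here are invented for illustration.

```perl
use strict;
use warnings;

my $name = 'outer';
my $seen_inside;
{
    my $name = 'inner';      # a different variable, private to this block
    $seen_inside = $name;
}
# Here $name is 'outer' again; the inner variable no longer exists.

my @list = (10, 20, 30);
my ($first) = @list;         # list context: the first element (10)
my $count   = @list;         # scalar context: the number of elements (3)

my ($fred, $barney) = (1, 2);   # the parentheses declare both variables

print "$seen_inside $name $first $count $fred $barney\n";
# prints: inner outer 10 3 1 2
```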
The use strict Pragma
The return Operator
| my @names = qw/ fred barney betty dino wilma pebbles bamm-bamm /; my $result = &which_element_is("dino", @names); sub which_element_is { my($what, @array) = @_; foreach (0..$#array) { # indices of @array's elements if ($what eq $array[$_]) { return $_; # return early once found } } -1; # element not found (return is optional here) } | return a value immediately without executing the rest of the subroutine |
Omitting the Ampersand
- If the compiler sees the subroutine definition before the invocation, or if Perl can tell from the syntax that it's a subroutine call (for example, the name is followed by an argument list in parentheses),
the subroutine can be called without an ampersand.
- If the subroutine has the same name as a Perl built-in, you must use the ampersand to call it.
For example, if you define your own subroutine &chomp, calling it with an ampersand is sure to call your subroutine; without the ampersand, you'd be calling the built-in chomp.
- Perl will usually be able to warn you about this when you have warnings turned on.
- A return with no arguments will return undef in a scalar context or an empty list in a list context.
This can be useful for an error return from a subroutine, signalling to the caller that a more meaningful return value is unavailable.
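A sketch of both ideas: a bare return used as a "not found" signal, and an ampersand used to reach a subroutine that shares its name with a built-in. The body of the user-defined chomp is made up for illustration.

```perl
use strict;
use warnings;

sub which_element_is {
    my ($what, @array) = @_;
    foreach (0..$#array) {
        return $_ if $what eq $array[$_];   # return early once found
    }
    return;       # undef in scalar context, empty list in list context
}

my @names = qw( fred barney betty );

# The definition was seen above, so no ampersand is needed here.
my $found   = which_element_is('betty', @names);   # 2
my $missing = which_element_is('dino',  @names);   # undef
my @missing = which_element_is('dino',  @names);   # empty list

# A subroutine named like a built-in must be called with an ampersand.
sub chomp { "my own chomp saw: @_" }
my $mine = &chomp('pebbles');                      # calls the subroutine above
```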
Non-Scalar Return Values
| sub list_from_fred_to_barney { if ($fred < $barney) { # Count upwards from $fred to $barney $fred..$barney; } else { # Count downwards from $fred to $barney reverse $barney..$fred; } } $fred = 11; $barney = 6; @c = &list_from_fred_to_barney; # @c gets (11, 10, 9, 8, 7, 6) | A scalar isn't the only kind of return value a subroutine may have. If you call your subroutine in a list context, it can return a list of values. |
Chap 5 Input and Output
Input from Standard Input
| Shortcut | Original | Notes |
| while (<STDIN>) { print "I saw $_"; } | while (defined($line = <STDIN>)) { print "I saw $line"; } | The <STDIN> operator returns undef when you reach end-of-file. The shortcut works only if there's nothing but the line-input operator in the conditional of a while loop. If you put anything else into the conditional expression, this shortcut won't apply. |
| foreach (<STDIN>) { print "I saw $_"; } | | Evaluating <STDIN> in a list context gives you all of the (remaining) lines of input as a list, and each element of the list is one line. |
- In the while loop, Perl reads a line of input, puts it into a variable, and runs the body of the loop.
Then, it goes back to find another line of input.
- But in the foreach loop, the line-input operator is being used in a list context since foreach needs a list to iterate through.
So, it has to read all of the input before the loop can start running.
- It's generally best to use code like the while loop's shortcut whenever possible, since it processes input one line at a time.
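The difference between reading one line at a time and reading everything at once can be observed directly. This sketch uses an in-memory filehandle ($fh on the made-up string $input) as a stand-in for STDIN.

```perl
use strict;
use warnings;

my $input = "fred\nbarney\nbetty\n";     # stand-in for standard input
open my $fh, '<', \$input or die "Cannot open in-memory file: $!";

my $first;
while (defined(my $line = <$fh>)) {      # scalar context: one line per pass
    chomp($first = $line);
    last;                                # stop early; the rest is still unread
}

my @rest = <$fh>;                        # list context: all remaining lines at once
close $fh;

print "first=$first, ", scalar(@rest), " lines left\n";
# prints: first=fred, 2 lines left
```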
Input from the Diamond Operator
| $ ./my_program fred barney betty | Runs the command my_program, which should process file fred, followed by file barney, followed by file betty. |
| $ ./my_program fred - betty | The program should process file fred, followed by the standard input stream, followed by file betty. |
| while (<>) { chomp; print "It was $_ that I saw!\n"; } | $ ./my_prog fred barney It was [a line from file fred] that I saw! It was [another line from file fred] that I saw! ...... [until it reaches the end of file fred.] [Then, it will automatically go on to file barney] It was [a line from file barney] that I saw! It was [another line from file barney] that I saw! ...... [until it reaches the end of file barney.] |
- The invocation arguments to a program are normally a number of "words" on the command line after the name of the program.
In this case, they give the names of the files your program will process in sequence.
- If you give no invocation arguments, the program should process the standard input stream.
As a special case, if you give a hyphen as one of the arguments, that means standard input as well.
- The benefit of this is that you may choose where the program gets its input at runtime.
- The diamond operator (<>) is a special kind of line-input operator.
Instead of getting the input from the keyboard, it comes from the user's choice of input.
- There's no break when we go from one file to another;
when you use the diamond, it's as if the input files have been merged into one big file. The diamond will return undef at the end of all of the input.
- If the diamond operator can't open one of the files and read from it,
it'll print an allegedly helpful diagnostic message, such as: can't open wimla: No such file or directory
The Invocation Arguments
- @ARGV is a special array preset by the Perl interpreter as the list of the invocation arguments.
When your program starts, @ARGV is already stuffed full of the list of invocation arguments.
- C programmers may be wondering about argc (there isn't one in Perl),
and what happened to the program's own name (that's found in Perl's special variable $0, not @ARGV).
- You can use @ARGV like any other array; you can shift items off of it or use foreach to iterate over it.
- The diamond operator looks in @ARGV to determine what filenames it should use.
This means that after your program starts and before you start using the diamond, you've got a chance to tinker with @ARGV.
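Tinkering with @ARGV can be sketched as follows. To stay self-contained, the example writes a temporary file with the core File::Temp module and then points @ARGV at it; the file contents and the @seen array are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw( tempfile );

# Create a small input file so that the sketch does not depend on existing files.
my ($out, $filename) = tempfile(UNLINK => 1);
print {$out} "fred\nbarney\n";
close $out;

@ARGV = ($filename);      # set the file list before the first use of <>

my @seen;
while (<>) {              # the diamond reads from the files named in @ARGV
    chomp;
    push @seen, $_;
}

print "It was $_ that I saw!\n" foreach @seen;
```

In a real program, the same assignment could supply default filenames when the user gives none.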
Output to Standard Output
| | print @array; | print "@array"; |
| @array = qw/ fred barney betty / | fredbarneybetty (the elements are printed one after another, with nothing between them) | fred barney betty (the array is interpolated into one string, with spaces between the elements) |
| if @array is a list of unchomped lines of input | fred barney betty (each on its own line) | fred barney betty (each on its own line, but every line after the first is indented by one space) |
| when to use | if your strings contain newlines, you may want to use this | if your strings don't contain newlines, you may want to use this |
| print <>; # almost the source code for the Unix command 'cat' |
| print sort <>; # almost the source code for the Unix command 'sort' |
- It's normal for your program's output to be buffered.
Instead of sending out every little bit of output immediately, it'll be saved until there's enough to bother with.
- Generally, the output will go into a buffer that is flushed (that is, actually written to disk or wherever)
only when the buffer gets full or when the output is otherwise finished (such as at the end of runtime).
- Since print is looking for a list of strings to print, its arguments are evaluated in list context.
Since the diamond operator will return a list of lines in a list context, these can work well together.
- Remember the rule that parentheses in Perl may be omitted except when doing so would change the meaning of a statement.
| print("Hello, world!\n"); print "Hello, world!\n"; | same |
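| print (2+3); print (2+3)*4; | The first prints 5. Oops: the second also prints 5! It looks like a function call, so it is a function call: print(2+3) runs first, and then its return value (true or false) is multiplied by 4 and the result is thrown away. |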
- If print is followed by an open parenthesis, ensure the corresponding closed parenthesis comes after all of the arguments to that function.
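Two ways to print the full product are sketched below. The unary plus form is a common idiom but is not mentioned in the notes above, so treat it as an addition.

```perl
use strict;
use warnings;

print((2 + 3) * 4, "\n");    # 20: the outer parentheses enclose all the arguments
print +(2 + 3) * 4, "\n";    # 20: the plus keeps "(" from starting the argument list

my $ok = print "";           # print itself returns a true value on success
```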
Formatted Output with printf