Unix and Software Tools (P51UST)
Awk Programming (2)
Ruibin Bai (Room AB326)
Division of Computer Science
The University of Nottingham Ningbo, China
P51UST: Unix and Software Tools
This Lecture
• Awk commands
• Loops and conditionals
• Arrays
• Functions
Awk Commands
• Types of commands
– Assignments of variables or arrays
– Input/output
– String operations
– Control-flow commands
– User-defined functions (commands)
Assignment
• User-defined awk variables are initialised to either
zero or the empty string, depending on how they are
used.
• Assign variables with =, e.g.
– FS = ":"
– var = count+2
– var = max-min
• Assignment syntax is flexible about whitespace
– Spaces are allowed before and/or after the equals sign '='
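A minimal sketch of the initialisation rule, wrapped in a shell command so it can be run directly. The variable names total and label are made up for illustration; neither is assigned before its first use.

```sh
out=$(awk 'BEGIN {
    total = total + 5      # numeric context: total starts out as 0
    label = label "abc"    # string context: label starts out as ""
    print total, label     # prints: 5 abc
}')
echo "$out"
```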
Input/output (1)
• print function
– print [argument] [destination]
– Used to print one or more variables, fields, or strings
to standard output.
– If arguments are separated by spaces, their values are
concatenated; if separated by commas, they are
separated by OFS (the output field separator).
– Strings are enclosed in quotation marks.
– The output can be redirected to a file:
print $1, $2, $4 > "myfile"
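The difference between spaces and commas in print can be seen in a small sketch. The input line and the choice of "-" as OFS are assumptions made only for this illustration.

```sh
out=$(printf 'a b c d\n' | awk 'BEGIN { OFS = "-" }
{
    print $1 $2     # no comma: values are concatenated, prints ab
    print $1, $2    # comma: values are joined with OFS, prints a-b
}')
echo "$out"
```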
Input/output (2)
• Formatted printing: printf
• Syntax
printf ([format [, values]])
• Very similar syntax to printf in sh.
printf ("*%-10s*%-5d*%+5d\n","hello",10,10)
Output:
*hello     *10   *  +10
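A few more format specifiers, as a hedged sketch (the values are arbitrary): %5s right-justifies in a width of 5, %-5s left-justifies, %05d pads with zeros, and %.2f keeps two decimal places. The | characters only make the padding visible.

```sh
out=$(awk 'BEGIN {
    # width, justification, zero padding and precision in one call
    printf("|%5s|%-5s|%05d|%.2f|\n", "ab", "ab", 42, 3.14159)
}')
echo "$out"
```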
Input/output (3)
• getline - Read a line of input from the keyboard or from a file
getline [variable] [< input-file] or
command | getline [variable]
• Example: no input file is specified, so getline reads the
next record of the current input into $0
{
printf("Please enter two values > ")
getline
printf("$1 = %s\t$2 = %s\n", $1, $2)
}
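The "command | getline" form can be sketched as follows. The echo command stands in for any real command, and the names cmd, line and parts are chosen for this illustration only.

```sh
out=$(awk 'BEGIN {
    cmd = "echo hello world"
    cmd | getline line            # read one line of the command output into line
    close(cmd)                    # close the pipe once we are done with it
    n = split(line, parts, " ")
    print n, parts[2]             # prints: 2 world
}')
echo "$out"
```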
getline
• Reading from keyboard
getline var-name < "-" or
getline var-name < "/dev/stdin" or
getline var-name < "/dev/tty"
• Example
BEGIN{
printf("Please enter the name I should search for: > ")
getline name < "/dev/tty"
}
$1 == name || $2 == name || $3 == name {
printf("$1 = %s\t$2 = %s\t$3 = %s\n", $1, $2, $3)
}
A Note About getline
• getline is a function and does return a value, BUT if
you put brackets after it, e.g.:
– getline()
– You will get an error!
• Examples
– getline newValue < "myFile"
Here, the input record is assigned to the variable
"newValue".
– BEGIN {
printf "Enter a name:>"
getline < "-"
print
}
In this example, the user is prompted to enter their
name. This is assigned to $0, and the print statement
outputs the value of $0 by default.
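Because getline returns a value (1 on success, 0 at end of file, -1 on error), it can drive a loop. A minimal sketch, assuming a temporary three-line file created just for the demonstration:

```sh
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"
out=$(awk -v file="$tmp" 'BEGIN {
    # keep reading while getline reports success (return value 1)
    while ((getline line < file) > 0)
        count++
    close(file)
    print count                   # prints: 3
}')
rm -f "$tmp"
echo "$out"
```

Testing for "> 0" rather than just a non-zero value matters: an unreadable file makes getline return -1, which would otherwise loop forever.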
Control-Flow Commands (1)
• Conditionals
if (condition) {
statement1
statement2
…
}
else {
statement3
statement4
…
}
• Conditional operators
<       less than
<=      less than or equal to
==      equal to
>       greater than
>=      greater than or equal to
!=      not equal to
~ /re/  matches (contains) the regular expression re
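A short sketch combining if/else with a numeric comparison and the ~ operator. The input lines, the threshold of 10 and the names size and note are invented for the example.

```sh
out=$(printf 'apple 3\nbanana 12\ncherry 7\n' | awk '{
    if ($2 >= 10) {
        size = "large"
    } else {
        size = "small"
    }
    note = ""
    if ($1 ~ /an/) {              # true when field 1 contains "an"
        note = " (name contains an)"
    }
    print $1, size note
}')
echo "$out"
```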
Control-Flow Commands (2)
• For loops
for (x= start; x<=maximum; x++){
command(s)
}
Or
for (element in array) {
command(s)
}
for loop example
BEGIN{
for (x=1; x<=10; x++)
print x
}
Control-flow Commands (3)
• While loops
BEGIN {
while (name == ""){
printf("Give me a name please >")
getline name < "/dev/stdin"
}
}
$1 == name || $2 == name {
printf("here are the data you requested:\n\n")
printf("\t%s\n\n", $0)
}
Control-flow Commands (4)
• break
– Used to exit from a loop
• continue
– Skip the rest of the current iteration and start the
next iteration of the loop
• next
– The next statement forces awk to immediately stop
processing the current record and go on to the next
record.
• nextfile
– Skip the remainder of an input file and go on to the
next input file
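The following sketch shows next, continue and break working together. The input data and the rules (skip lines starting with #, ignore zeros, stop at the first negative number) are assumptions made for the illustration.

```sh
out=$(printf '# comment\n1 2 -1 4\n3 0 5\n' | awk '
/^#/ { next }                     # skip this record, go on to the next one
{
    sum = 0
    for (i = 1; i <= NF; i++) {
        if ($i == 0) continue     # ignore zeros, move to the next field
        if ($i < 0) break         # leave the loop at the first negative value
        sum += $i
    }
    print sum
}')
echo "$out"
```

The first data line sums 1 and 2 before break stops the loop; the second sums 3 and 5 while continue skips the 0.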
Arrays in Awk
• awk has arrays with elements subscripted with
numbers or strings (associative arrays)
• Assign arrays in one of two ways:
– Name them in an assignment statement
myArray[i]=n++
– Use the split() function (to be discussed shortly)
n=split(input, words, " ")
Array in Awk (2)
• In awk, it's customary to start array indices at
1, rather than 0.
myarray[1]="jim"
myarray[2]=456
• Array elements can be subscripted with number or
string.
BEGIN {
my_array[1] = "pear"
my_array[2] = "tree";
my_array["David"] = "Cassidy";
}
Reading Elements in an Array
• Using a for loop:
for (item in array)
print array[item]
• Using the operator in:
if (key in array)
…
• …use this to see if an element exists. It does so by
testing whether its index exists (nawk). Note that index
itself is a built-in function name, so use a different
variable name such as key.
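One reason to prefer the in operator: merely referring to array[key] creates the element, whereas the in test does not. A minimal sketch with a made-up fruit array:

```sh
out=$(awk 'BEGIN {
    fruit["pear"] = 1
    before = ("apple" in fruit)   # 0: testing does not create the element
    dummy = fruit["apple"]        # a plain reference creates it (empty value)
    after = ("apple" in fruit)    # 1: the element now exists
    print before, after
}')
echo "$out"
```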
An Array Example
BEGIN {
my_array[1] = "Partridge"
my_array[2] = "pear"
my_array[3] = "tree"
my_array[13] = "Cassidy"
print "Print array element using item-in-array for loop:"
for (i in my_array) print i "=" my_array[i]
print "\nPrint array element using c-style for loop:"
min=1; max=13
for (i=min; i<= max; i++) {
if (i in my_array)
print i "=" my_array[i]
}
}
An Array Example
BEGIN {
# A value can be stored at any index
my_array[1] = "Partridge"
my_array[2] = "pear"
my_array[3] = "tree"
my_array[13] = "Cassidy"
print "Print array element using item-in-array for loop:"
# Elements are not necessarily printed in order here
for (i in my_array) print i "=" my_array[i]
print "\nPrint array element using c-style for loop:"
min=1; max=13
for (i=min; i<= max; i++) {
# Test whether a value has ever been stored for this index
if (i in my_array)
print i "=" my_array[i]
}
}
Copying an Array
• The awk language does not support assignment of arrays.
• Thus, to copy an array, you must copy the individual values
from one array to the other.
BEGIN {
arr_len = split( "Mary lamb freezer", my_array );
for (word in my_array) {
copy_array[word] = my_array[word]
}
for (word in copy_array) {
print copy_array[word]
}
}
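The copy loop can be wrapped in a function, since awk passes arrays to functions by reference. This is only a sketch: copy_array and its parameter names are hypothetical, and the extra parameter k is there to act as a local variable.

```sh
out=$(awk '
# copy every element of src into dst; k is a local loop variable
function copy_array(src, dst,    k) {
    for (k in src)
        dst[k] = src[k]
}
BEGIN {
    n = split("Mary lamb freezer", my_array)
    copy_array(my_array, copy)
    my_array[1] = "changed"       # the copy is independent of the original
    print n, copy[1], copy[3]     # prints: 3 Mary freezer
}')
echo "$out"
```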
Delete an Array Element
• Syntax
delete array_name[key]
• Example
BEGIN {
my_array["purple"] = "Partridge";
my_array["mountain"] = "pear";
my_array["majesties"] = "tree";
my_array["fruited"] = "Cassidy";
mykey = "fruited";
delete my_array["mountain"];
delete my_array[mykey];
for (i in my_array) { print i "=" my_array[i]; }
}
String Functions (1)
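• length ([argument])
– Return the length of the argument ($0 if omitted)
• index (string, target)
– Return the position (counting from 1) of the first
character of target within string, or 0 if target does
not occur.
• substr (string, start [, length])
– Return the substring of string that starts at position
start and is at most length characters long (to the end
of the string if length is omitted)
• split (string, array [, separator])
– Split the string into pieces at each separator (FS by
default), store them in array[1], array[2], …, and
return the number of pieces.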
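The four functions above in one small sketch; the sample strings are arbitrary.

```sh
out=$(awk 'BEGIN {
    s = "hello world"
    n = split("a:b:c", parts, ":")
    # length, position of "world", first 5 characters,
    # number of pieces, third piece, position of a missing target
    print length(s), index(s, "world"), substr(s, 1, 5), n, parts[3], index(s, "xyz")
}')
echo "$out"
```

This prints "11 7 hello 3 c 0": positions count from 1, and index returns 0 when the target is absent.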
String Functions (2)
• Assume the following target file (/etc/passwd), whose
colon-separated fields are:
1. Username
2. Password
3. User ID (UID)
4. Group ID (GID)
5. User ID Info
6. Home directory
7. Command/shell
String Functions (3)
• Print each user's login and first name using the index
and substr functions
BEGIN{
print "Here are the user IDs and first names from /etc/passwd"
FS=":"
}
{
blank = index($5," ")
first = substr($5, 1, blank-1)
printf("User ID = %-15s \tfirst name = %-25s\n", $1, first)
}
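One edge case in the index/substr approach: if field 5 contains no space, index returns 0 and substr($5, 1, -1) yields an empty string. A possible guard is sketched below; the two passwd-style lines are invented sample data, not real accounts.

```sh
out=$(printf 'jsmith:x:1001:100:John Smith:/home/jsmith:/bin/sh\ndaemon:x:1:1:daemon:/usr/sbin:/bin/sh\n' | awk -F: '{
    blank = index($5, " ")
    if (blank > 0) {
        first = substr($5, 1, blank - 1)
    } else {
        first = $5                # no space found: use the whole field
    }
    print $1, first
}')
echo "$out"
```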
String Functions (4)
• Using the split function to print each user's login,
first name and last name.
BEGIN{
FS=":"
}
{
howmany = split($5, names, " ")
printf("User ID = %-15s firstname = %-15s lastname = %-15s\n",
$1, names[1], names[howmany])
}
The system() Function
• The system() function allows a programmer to
execute a command whilst within an awk script.
system("cmd")
• The awk script waits for the command to finish
before continuing execution.
• The output of the command is NOT available for
processing from within awk.
• The system() function returns an exit status which
can be tested by the awk script.
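A sketch of both points: system() yields only an exit status, so when the output of a command is needed inside awk, the "command | getline" form can be used instead. The commands true, false and echo are stand-ins chosen for the illustration.

```sh
out=$(awk 'BEGIN {
    ok = system("true")           # exit status 0 means success
    bad = system("false")         # a non-zero exit status means failure
    cmd = "echo captured"
    cmd | getline text            # a pipe, unlike system(), captures the output
    close(cmd)
    print ok, (bad != 0), text    # prints: 0 1 captured
}')
echo "$out"
```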
An Example Using system()
BEGIN {
if (system("mkdir UST") == 0)
{
if (system("cd UST") != 0)
print "change directory - failed"
}
else
print "make directory - failed"
}
This example tries to create a new directory called UST.
If successful, the code tries to change directory to UST
(the cd runs in a child shell, so awk's own working
directory is unchanged). If mkdir fails, an error is printed.
An Example Using system()
$ awk -f create.awk
$ ls UST
Here, the script (called create.awk) is run and is
successful. "ls UST" doesn't print anything
because UST is empty.
$ awk -f create.awk
mkdir: UST: File exists
make directory - failed
Here, the script is run for a second time, so the mkdir
command fails because UST already exists. The first error
is given by the mkdir command, the second error
is given by the awk script.
User-Defined Functions
• You can define your own functions in awk, in much
the same way as you define a function in C or Java
– Thus code that is to be repeated can be grouped
together inside a function
– Allows code reuse!
– NOTE: when calling a function you have defined
yourself, no space is allowed between the function
name and the opening bracket.
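One awk idiom worth knowing here, added as a sketch rather than something the slides cover: variables used inside a function are global unless they appear in the parameter list, so extra parameters (conventionally set apart by spaces) serve as local variables. The function sum_to is hypothetical.

```sh
out=$(awk '
# n is the real argument; i and total are extra parameters used as locals
function sum_to(n,    i, total) {
    for (i = 1; i <= n; i++)
        total += i
    return total
}
BEGIN {
    i = 99
    print sum_to(4), i            # prints: 10 99 (the global i is untouched)
}')
echo "$out"
```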
An Example using a Function and an Array
# capitalise the first letter of each word in a string
function capitalise(input)
{
result= ""
n=split(input, words, " ")
for (i=1; i <=n; i++)
{
w = words[i]
w = toupper(substr(w, 1, 1)) substr(w, 2)
if (i > 1)
result = result " "
result = result w
}
return result
}
# this is the main program
{ print capitalise($0) }
Break-down of Example
# capitalise the first letter of each word in a string
function capitalise(input)
{
…
– input is the variable to be used in the function; it
contains whatever the caller called the function with
Break-down of Example (2)
…
result = ""   # Set result to be an empty string
n = split(input, words, " ")
…
– Take the input and split it up into the array "words",
dividing the input wherever there is a space
– n is the result returned by the split function and
contains the number of elements in the array "words"
Break-down of Example (3)
…
# For each element of array from 1 to the number of elements…
for (i=1; i <=n; i++)
{
# Assign element to w
w = words[i]
# Take the substring which starts at the first character and
# has a length of 1 and capitalise using toupper(); then take
# the remainder of the string starting at the 2nd character
# and append it to the capitalised character
w = toupper(substr(w, 1, 1)) substr(w, 2)
if (i > 1)
# Tag a space on to the end of the result string
result = result " "
# Tag the next word on to the end of the result string
result = result w
}
return result
}
…
Break-down of Example (4)
…
# this is the main program
{ print capitalise($0) }
– The line starting with # is a comment in awk
– The action calls the capitalise function with the
entire input record and prints the result.
Output from Example
• Given the input file:
In theory there is no difference between theory and
practice, but in practice there is
• …our capitalise function will output:
In Theory There Is No Difference Between Theory And
Practice, But In Practice There Is