13.1
Wrapping up
13.2
Running Other Programs
13.3
Running programs from a script
You can run external programs using the system function, which returns the program's exit status (0 means success):
$exitValue = system("blastall.exe ...");
if ($exitValue != 0) { die "blast failed!"; }
This way the output of blast is printed to the screen.
You can use ' > ' to redirect the output to a file:
$exitValue = system("blastall.exe ... > out.blast");
If you want to capture the output, use "back-ticks" (the key to the left of the "1" key on
your keyboard):
@blastOutput = `blastall.exe ...`;
In this case the output of blast is stored in the array, one line per element.
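The two approaches can be tried side by side with a minimal sketch like the one below. It assumes an `echo` command is available and uses it as a stand-in for the real program, so that the script runs without blast installed; the variable `$command` is only illustrative.

```perl
use strict;
use warnings;

# 'echo' is a stand-in for the real program (an assumption of this sketch).
my $command = "echo hit1";

# 1. system: the program's output goes straight to the screen,
#    and the return value is the exit status (0 on success).
my $exitValue = system($command);
if ($exitValue != 0) {
    die "command failed with exit code ", $exitValue >> 8, "\n";
}

# 2. Back-ticks: the output is captured instead of being printed.
#    In list context each output line becomes one array element.
my @output = `$command`;
chomp @output;    # remove the trailing newline from every line
print "captured ", scalar(@output), " line(s): @output\n";
```

After a back-tick command, the special variable `$?` holds the exit status, so the same `!= 0` check can be applied there too.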
13.4
Dealing with less common formats
e.g. Rate4Site: it is still not very widely used (174 citations so far…), so there
are no BioPerl modules that will run it for you and read its output:
#POS  SEQ  SCORE     QQ-INTERVAL          STD     MSA DATA
#The alpha parameter 1.5
1     K    -0.9763   [-1.6621,-0.5750]    0.8777  6/6
2     V     0.9820   [-0.1107,2.2169]     1.5983  6/6
3     F     0.0035   [-0.9640,0.4935]     1.3195  6/6
4     S     0.2010   [-0.7766,0.8962]     1.3975  6/6
5     K    -0.3480   [-1.1423,0.1673]     1.0990  6/6
6     C    -0.7887   [-1.4855,-0.3560]    1.0182  6/6
7     E    -0.9894   [-1.6621,-0.5750]    0.8714  6/6
8     L     0.0153   [-0.9640,0.4935]     1.3378  6/6
9     A    -1.1347   [-1.6621,-0.7766]    0.7487  6/6
10    H    -0.3200   [-1.1423,0.1673]     1.1252  6/6
11    K    -0.3557   [-1.1423,0.1673]     1.1077  6/6
12    L    -0.8331   [-1.4855,-0.3560]    0.9965  6/6
13    K    -0.9763   [-1.6621,-0.5750]    0.8777  6/6
14    A     1.6809   [0.4935,2.2169]      1.6672  6/6
15    Q     1.4315   [0.1673,2.2169]      1.7297  6/6
16    E     0.1025   [-0.9640,0.8962]     1.3784  6/6
17    M     0.5006   [-0.5750,1.4226]     1.4456  6/6
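Since no module reads this format, a regular expression can pull the columns apart. The following is a minimal sketch, not a complete parser: it assumes header lines start with "#" and that every data line has the six columns shown in the Rate4Site output above. The hash keys (pos, aa, score, low, high, std) are names chosen here for illustration, and a real script would read the output file instead of the embedded sample.

```perl
use strict;
use warnings;

# Three rows copied from the output above, embedded so the sketch is self-contained.
my $sample = <<'END';
#POS SEQ SCORE QQ-INTERVAL STD MSA DATA
#The alpha parameter 1.5
1 K -0.9763 [-1.6621,-0.5750] 0.8777 6/6
2 V 0.9820 [-0.1107,2.2169] 1.5983 6/6
3 F 0.0035 [-0.9640,0.4935] 1.3195 6/6
END

my @sites;
foreach my $line (split /\n/, $sample) {
    next if $line =~ /^\s*#/;    # skip header lines (assumed to start with '#')
    next if $line =~ /^\s*$/;    # skip empty lines
    # position, residue, score, [low,high] interval, std, "x/y" MSA data
    if ($line =~ /^\s*(\d+)\s+([A-Z])\s+(-?\d+\.\d+)\s+\[\s*(-?\d+\.\d+)\s*,\s*(-?\d+\.\d+)\s*\]\s+(\d+\.\d+)\s+\d+\/\d+/) {
        push @sites, { pos => $1, aa => $2, score => $3,
                       low => $4, high => $5, std => $6 };
    }
}

# Sort the sites by score and report the one with the lowest score.
my @sorted = sort { $a->{score} <=> $b->{score} } @sites;
print "Parsed ", scalar(@sites), " sites\n";
print "Lowest score: $sorted[0]{aa}$sorted[0]{pos} ($sorted[0]{score})\n";
```

Storing each row as a hash reference inside an array keeps the columns together and makes sorting by any column a one-line change.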
13.5
Running a local blast
1. You could install blast on your computer from:
ftp.ncbi.nlm.nih.gov
(once there, go to the directory blast/executables/release/).
However, this may be difficult, and you will also need to download and install the
databases you want to search.
2. Alternatively, you can work on the Unix servers of the bioinformatics unit and
use the local blast that is already installed there.
Genbank databases that are installed there can be used for blast and for any
other work, such as getting a sequence by its accession.
13.6
Some Final Notes
13.7
BioPerl issues
• If you are having problems installing BioPerl on your
computer, you might need to add a repository to PPM.
We've added instructions to the course webpage.
• BioPerl warnings of the form:
Subroutine ... redefined at ...
should not trouble you too much.
13.8
Referencing - Dereferencing
Referencing an array:
$gradesRef = \@grades;
$arrRef = [85,91,67];
Dereferencing an array:
@arr = @{$arrRef};
$element1 = $arrRef->[0];
Referencing a hash:
$phoneBookRef = \%phoneBook;
$hashRef =
{"pupko"=>7693,
"lab" =>9245 };
Dereferencing a hash:
%hash = %{$hashRef};
$myVal = $hashRef->{"myKey"};
(Diagram: $gradesRef => @grades, $phoneBookRef => %phoneBook; $arrRef and $hashRef
each point to an anonymous array or hash.)
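The following short script is a sketch of how these pieces behave together, using the same variable names as above (@copy is an extra name introduced here). It shows that a reference points at the original data, while dereferencing the whole array with @{...} makes a copy.

```perl
use strict;
use warnings;

my @grades    = (85, 91, 67);
my %phoneBook = ("pupko" => 7693, "lab" => 9245);

# Take references to the existing array and hash.
my $gradesRef    = \@grades;
my $phoneBookRef = \%phoneBook;

# A change made through the reference is visible in the original array.
$gradesRef->[0] = 90;
print "first grade is now $grades[0]\n";

# Dereferencing the whole array copies it; the copy is independent.
my @copy = @{$gradesRef};
$copy[1] = 0;
print "second grade is still $grades[1]\n";

# Walk over a hash through its reference.
foreach my $name (sort keys %{$phoneBookRef}) {
    print "$name: $phoneBookRef->{$name}\n";
}
```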
13.9
Really final notes
• Ex. 6 has been updated (some clarification added).
• If and when you use Perl after the course, we
will be glad to help you along the way.
13.10
The Exam
13.11
• The exam will take place on 14/07/2009, from 09:00 to 12:00.
• Material covered by the exam: everything.
• The exam will be held on the computers in computer classrooms all
around the campus (arrive early to find your classroom!).
Entry to the classroom will be at 08:50.
• The computers will be disconnected from the network (i.e. no
internet access, sorry…).
• There will be files waiting on the computer which you will use
during the exam, and the exam questionnaire on paper.
• You will write your solutions as normal Perl scripts.
• At the end of the exam we will collect your scripts.
13.12
• The following software will be on the computers:
– ActivePerl
– Perl Express
– Regex Coach
– Eclipse – If you want to have Eclipse let us know before
the exam by email!
• You are allowed to bring two double-sided A4 pages.
• You may use the Perl documentation in Perl Express (press the
button with the purple book on the left panel, or use the menu:
Help > Perl Documentation).
13.13
Perl function documentation in PerlExpress
13.14
Exam questions from 2008
2. (10pts) Write a script in the file named "exam2.pl" that asks
the user to enter a number n and then a list of numbers,
separated by spaces. Print every n’th number.
For example: If the user enters “3” and “1 2 3 4 5 6 7 8 9 10”
the script should print "3 6 9".
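One possible approach is sketched below; it is not the official solution. The subroutine name every_nth is chosen here for illustration, and the two inputs are hard-coded so the sketch runs without interaction. In the exam script they would be read from <STDIN> and chomped.

```perl
use strict;
use warnings;

# Return the n'th, 2n'th, 3n'th ... elements of a list.
sub every_nth {
    my ($n, @numbers) = @_;
    return () if $n < 1;    # guard: a step below 1 makes no sense
    my @picked;
    # The n'th number sits at index n-1, so start there and jump n each time.
    for (my $i = $n - 1; $i < @numbers; $i += $n) {
        push @picked, $numbers[$i];
    }
    return @picked;
}

# Hard-coded stand-ins for the two values the user would type.
my $n       = 3;
my $line    = "1 2 3 4 5 6 7 8 9 10";
my @numbers = split /\s+/, $line;

print join(" ", every_nth($n, @numbers)), "\n";    # prints "3 6 9"
```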
13.15
Exam questions from 2008
3. (20pts) Write regular expressions to match the patterns
below. Save only the regular expressions in the file
"exam3.txt" (don't write a script there).
b. Find this motif in a protein sequence: G, followed by
any 5 amino acids, followed by one or two A's, followed
by T or S. Match both upper-case and lower-case letters.
c. Match a reference line in a GenBank file, such as this
example:
RL EMBO J. 10:2879-2887(1991).
It must start with "RL", followed by the abbreviation of
the journal name, followed by the volume number,
followed by the pages, followed by a 4-digit year. Your
expression should extract the two page numbers
separately – "2879" and "2887" in the above example.
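As a hedged sketch (one possible answer, not the official one), the two patterns could look like this. The protein strings are made-up test inputs, and the names $motif and $refLine are chosen here; "any amino acid" is approximated by "any character".

```perl
use strict;
use warnings;

# b. G, any 5 characters, one or two A's, then T or S; /i ignores case.
my $motif = qr/G.{5}A{1,2}[TS]/i;

# c. "RL", journal abbreviation, volume, first page, last page, 4-digit year.
#    Each part of interest is captured in its own pair of parentheses.
my $refLine = qr/^RL\s+(.+?)\s+(\d+):(\d+)-(\d+)\((\d{4})\)/;

# Made-up sequences: the first two contain the motif, the third does not.
foreach my $seq ("MGLKVDEATW", "mglkvdeaasw", "MGLKVDEGTW") {
    print "$seq: ", ($seq =~ $motif ? "match" : "no match"), "\n";
}

my $line = "RL EMBO J. 10:2879-2887(1991).";
if ($line =~ $refLine) {
    my ($journal, $volume, $firstPage, $lastPage, $year) = ($1, $2, $3, $4, $5);
    print "journal=$journal volume=$volume pages=$firstPage,$lastPage year=$year\n";
}
```

Capturing the two page numbers in separate groups ($3 and $4) is what lets the script extract them separately, as the question requires.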
13.16
Exam questions from 2008
5. (20 pts) The file “exam5.pl” contains a script that reads a blast output
file (see for example “blast.txt”). The script should find all hit lines,
such as this one:
>ref|NT_039586.4 Mus musculus chromosome 13, strain C57BL
The hit lines always start with ">". Also, blast was run against the
mouse genome so all hit lines include "chromosome i" (i is the
chromosome number). Note that this script should ignore hits to the
X & Y chromosomes!
a. The script should store the count of the number of hits for every
chromosome in the array @chroms, using the chromosome
number as the index in the array (cell number 0 will not be used).
So, for example, if there were 2 hits to chromosome 1 and 5 hits
to chromosome 2, then the array should be: (0,2,5)
Fill in the missing lines 21-22 and 28-29. Save this version of the
script as “exam5a.pl”
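The counting idea in part (a) can be sketched as follows. This is not the exam5.pl script, which is not reproduced here. Apart from the first hit line, which is the example above, the input lines are invented for illustration, and a real script would read them from the blast output file.

```perl
use strict;
use warnings;

# Sample input: the first line is the example hit line; the rest are made up.
my @lines = (
    ">ref|NT_039586.4 Mus musculus chromosome 13, strain C57BL",
    ">made_up_hit_A Mus musculus chromosome 2, strain C57BL",
    ">made_up_hit_B Mus musculus chromosome 2, strain C57BL",
    ">made_up_hit_C Mus musculus chromosome X, strain C57BL",
    "Length = 12345",
);

my @chroms;
foreach my $line (@lines) {
    # Hit lines start with ">"; \d+ matches only numbered chromosomes,
    # so hits to the X and Y chromosomes are ignored automatically.
    if ($line =~ /^>.*chromosome\s+(\d+)/) {
        $chroms[$1]++;    # the chromosome number is the array index
    }
}

# Cells that never received a hit are undefined; set them to 0.
foreach my $i (0 .. $#chroms) {
    $chroms[$i] = 0 unless defined $chroms[$i];
}

print "(", join(",", @chroms), ")\n";
```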