algorithms - New Mexico Computer Science for All

CS151L Fall 2013
Week 9: Video Summary
Complexity in Computer Science
How does Google Maps or GPS work?
• Example of a shortest-path (time, distance, or cost) problem
• Find optimum path between any two points
Graph problems
• Shortest path between two points
• Traveling salesman problem
• Network routing
• Mazes
What is a graph?
• Collection (set) of nodes and edges
• Edges (links, branches) connect pairs of nodes
• Links may
Have numbers assigned to them (cost, time, capacity)
Have directions
• Nodes may also have properties
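One way to hold such a graph in a program is sketched below in Python, assuming a dictionary of dictionaries as the data structure. The node names, the costs, and the helper names edges and path_cost are invented for illustration and do not come from the course.

```python
# Hypothetical example graph: node names and costs are made up.
# Each node maps to its neighbors; each directed edge (link) carries a cost.
graph = {
    "A": {"B": 4, "C": 1},
    "B": {"D": 2},
    "C": {"B": 2, "D": 7},
    "D": {},
}

def edges(g):
    """Return every directed edge as a (from_node, to_node, cost) tuple."""
    return [(u, v, cost) for u, nbrs in g.items() for v, cost in nbrs.items()]

def path_cost(g, path):
    """Add up the edge costs along a path; a missing link costs infinity."""
    total = 0
    for u, v in zip(path, path[1:]):
        total += g[u].get(v, float("inf"))
    return total

print(edges(graph))
print(path_cost(graph, ["A", "C", "B", "D"]))
```

Here the links have both a direction and a number, which are the two link properties listed above.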
Why not look at all paths?
• Suppose we have n nodes
• How many paths are there?
• Potentially we could go directly from a given node to any other node
If no path exists, then cost is infinite
• Hence there are n-1 possible first steps
• Then for each of these n-1 nodes we can go to any of the remaining n-2 nodes
• Number of possible paths:
• (n-1)*(n-2)*(n-3)*...*3*2*1
• We call this function the factorial function: (n-1)!
• How big is it?
1! = 1, 2! = 2, 3! = 6, 4! = 24, 5! = 120
10! ≈ 3.6*10^6
100! ≈ 9.3*10^157
1000! ≈ 4.0*10^2567
• How many nodes are in the GPS database? Lots and lots
• Need a lot of Computer Science to deal with graph problems
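The counting argument above can be checked by brute force for small n. This is a minimal Python sketch, assuming every node is linked to every other node; count_orderings is a hypothetical helper name, not something from the course.

```python
import math
from itertools import permutations

def count_orderings(n):
    """Count the orders in which the other n-1 nodes can be visited
    when starting from one fixed node (assumes every link exists)."""
    others = range(1, n)
    return sum(1 for _ in permutations(others))

for n in (3, 5, 8):
    # The brute-force count agrees with the formula (n-1)!
    print(n, count_orderings(n), math.factorial(n - 1))

# Number of decimal digits in 100! and 1000!
print(len(str(math.factorial(100))), len(str(math.factorial(1000))))
```

The brute-force count is only practical for very small n, which is exactly the point of the argument.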
Complexity vs. complexity science
• Need algorithms and data structures to come up with feasible solutions to real
problems
• Measure complexity of an algorithm by how much time or space it uses for a
problem of a given size
• Many real-world problems are intractable in that they cannot be solved optimally in a feasible amount of time
• Complexity science deals with these problems
How do we count the complexity?
• Operations = time. What is an operation? Depends on problem
Example: searching and sorting operations are Compare and Swap
Note that these operations depend on the type of data
• Required memory
Big O
• We are interested in large problems and how an algorithm performs as we
increase size
• If we have an expression for the complexity, there will be a dominant term for
large size
• Ex: 10*n^2 + 1000*n + 100000
Eventually the n^2 term dominates
Multiplicative constants aren’t important
Expression is said to be O(n^2)
• The most important complexity classes are
O(1)
O(n)
O(n log(n))
O(n^2)
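To see the dominant term take over, the short Python sketch below evaluates the example expression for growing n and reports what fraction of the total comes from the n^2 term. The function names cost and quadratic_share are chosen here for illustration.

```python
def cost(n):
    """Operation count from the example: 10*n^2 + 1000*n + 100000."""
    return 10 * n**2 + 1000 * n + 100000

def quadratic_share(n):
    """Fraction of the total that comes from the n^2 term."""
    return (10 * n**2) / cost(n)

for n in (10, 100, 1000, 100000):
    print(n, cost(n), round(quadratic_share(n), 3))
```

For small n the constant 100000 is most of the total. For large n nearly all of it comes from the n^2 term, which is why the expression is called O(n^2).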
INTRODUCTION TO ALGORITHMS
What is an algorithm?
 A set of instructions that can be used repeatedly to solve a problem or complete a
task.
Why do we use algorithms?
• Don’t have to think about the details of the problem – we already know how to solve
it.
• All the information is in the algorithm – just have to follow it.
• Same algorithm can be used to sort a list whether it has 5 elements or 5000
• Anyone can follow it
Computers are problem solving machines and suited to using algorithms
Characteristics of an Algorithm
• Receives input or initial conditions
• Produces a result (output)
• Contains steps
Well-ordered
Clearly defined
Do-able
Clear stopping point or conditions
• General
Types of Algorithms sorted according to Application
• Counting – counting large groups of items
• Sorting - puts items in a certain order in a list
• Searching - finding an item in a list
• Mapping/Graphing - solving problems related to graph theory (shortest distance,
cheapest distance,…)
• Encryption – Encodes or decodes information
• Packing – how to fit the most items in a given space
• Maze – Creating or solving mazes
Developing an Algorithm
• What’s the problem?
• Start with input and output
• Abstraction and Decomposition
Break the problem into parts
Detail the steps required for each part
• Write the pseudo-code
• Refine the solution, if necessary
• Write the code
• Debug, test the code with different cases
• You’re done!
Example – Taking the Average
• Input and output
Input – list of numbers (5, 10, 12, 14, 16)
Output – average of the list
• Abstraction and decomposition – what are the steps?
Add up the numbers in the list (total)
Count the numbers in the list (count)
Average = total/count
• Write the Pseudo-Code
Initialize total and count to 0
For each number in the list
add that number to the total
add 1 to count
Divide total by count
• Refine if necessary – might find improvements
• Write the code – in the language you are using
• Debug – always!
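As an illustration of the "write the code" step, here is the pseudo-code above translated into Python. The course itself uses NetLogo, so the choice of Python is an assumption made for this sketch, and the empty-list check is an added refinement that is not in the pseudo-code.

```python
def average(numbers):
    """Average a list by keeping a running total and a running count."""
    total = 0
    count = 0
    for number in numbers:
        total = total + number   # add that number to the total
        count = count + 1        # add 1 to count
    if count == 0:
        # Refinement not in the pseudo-code: avoid dividing by zero.
        raise ValueError("cannot average an empty list")
    return total / count

print(average([5, 10, 12, 14, 16]))  # 57 / 5 = 11.4
```

Testing with different cases (one number, negative numbers, an empty list) is the debug step from the list above.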
Analyzing Algorithms
• There are many ways to solve the same problem
• Analyze algorithms
Evaluate an algorithm’s suitability (speed, memory needed)
Compare it with other possible algorithms
Find improvements
Algorithms tend to become better during analysis: shorter, simpler, more elegant
Big O Notation
• Used in computer science when analyzing algorithms
Measure of how an algorithm runs as you increase the input size dramatically.
• Processing time
• Memory used
• Another way to classify algorithms
Common Big O values
O(n)
O(n log(n))
O(n^2)
MORE ALGORITHMS
Algorithm: A set of instructions that can be used repeatedly to solve a problem or
complete a task.
As your problem grows, the need for a good algorithm grows
Need a systematic way to solve the problem
Two Problems
Counting
Mazes
Counting - Three counting problems
Students in your class
Students in your school
High school students in your state
Count students in a Classroom - Easy
Teacher can count
Students can count off
Counting students in a school - Harder
One possible method – “Brute Force”
Gather the students
Line them up
Count them (have someone count them or have them count off)
Another method – Divide and Conquer (algorithmic principle,
break the problem into smaller pieces and then solve)
Leave students in classrooms
Have each classroom count their students (ClassCount)
Add up the students from each classroom (SchoolCount)
Counting High School Students in the State - Even Harder
Can’t line them up
MUST USE Divide and Conquer
Have each School count their students
• At each school
Leave students in classrooms
Have each classroom count their students (ClassCount)
Add up the students from each classroom (SchoolCount)
Have the schools report their student numbers (SchoolCount)
Add up all the School’s student numbers to get the number of high
school students in the state (StateCount)
Pseudocode for the Student Counting problems
In each classroom in each school
    set ClassCount to 0
    while (students left in class to count)
        count the next student
        set ClassCount to ClassCount + 1
For the school do this too
    set SchoolCount to 0
    while (still classrooms left to count)
        go to the next classroom
        ask number of students in class (ClassCount)
        set SchoolCount = SchoolCount + ClassCount
For the state do this too
    set StateCount to 0
    while (still schools left to count)
        go to the next school
        ask number of students in school (SchoolCount)
        set StateCount = StateCount + SchoolCount
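The three counting levels can be sketched in Python as three small functions, each built on the one below it. The school names, room names, and student names are invented for illustration, and the nested-dictionary layout is an assumption of this sketch.

```python
# Hypothetical data: state -> schools -> classrooms -> list of students.
state = {
    "North High": {"Room 1": ["Ana", "Ben", "Cy"], "Room 2": ["Di", "Eli"]},
    "South High": {"Room A": ["Fay"], "Room B": ["Gus", "Hal", "Ida", "Jo"]},
}

def class_count(students):
    """Count off the students in one classroom (ClassCount)."""
    count = 0
    for _ in students:
        count = count + 1
    return count

def school_count(classrooms):
    """Add up the ClassCount of every classroom in one school (SchoolCount)."""
    total = 0
    for students in classrooms.values():
        total = total + class_count(students)
    return total

def state_count(schools):
    """Add up the SchoolCount of every school in the state (StateCount)."""
    total = 0
    for classrooms in schools.values():
        total = total + school_count(classrooms)
    return total

print(state_count(state))
```

Nobody ever has to line up all the students: each level only asks the level below for its number, which is the divide and conquer idea.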
LOOPING IN NETLOGO
We already know that there are three common types of loops in most programming
languages:
Infinite Loops
Counted Loops
Conditional Loops
Components of a Loop
Loops have some properties in common:
• Body of the loop: The portion of the code that is repeated.
• Iterations: The number of times the body of the loop is executed
Some types of loops have other properties
• Loop Variable: A variable that is incremented during the looping
process
• Loop Condition: A conditional (true or false) statement that is
checked during looping
Infinite Loops
An infinite loop repeats the body of the loop until one of the following occurs:
• A “stop” command in the program
• Manually turned off.
• A runtime error.
Already used them in NetLogo – Forever Loops
Counted Loops
Repeated a fixed maximum number of times.
Must know how many times
• Before programming
• Be able to calculate maximum number of times
NetLogo has a specific command for counted loops, the REPEAT command
repeat number
[
commands
]
• The commands in the square brackets [] are repeated the given number of
times.
Conditional Loops
Repeated while a certain conditional statement is true.
Must establish the condition statement initially and it must be true for the loop
to start
The condition must become false during the looping process in order for the
loop to stop, otherwise the loop becomes infinite.
NetLogo has a specific command for conditional loops, the WHILE command
while [condition]
[
commands
]
• The commands in the square brackets [] are repeated while the
condition is true.
• When the condition becomes false, the loop is exited.
DATA COMPRESSION
Run Length Coding
• Consider a long sequence of 0’s and 1’s where
0= sunny
1 = not sunny
• 0000000000000100000000000001000000
• Not much information in 0’s but 1’s are exciting
• We could argue the same way about a digitized letter that we want to fax
where 0=white and 1=black
• And rewrite as
• 13 0’s, 1, 13 0’s, 1, 6 0’s
• Or 13, 13, 6 since we know a 1 separates the 0’s
• We can encode the decimals in binary and have a much more compact
representation
• Using base 16, one 4-bit digit per count (D, D, 6), we get 12 bits instead of 34 bits
• Basis of fax transmission
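The run-length idea above can be sketched in a few lines of Python. This is a simplified illustration, not the actual fax standard: it assumes every run of 0's is shorter than 16 so that each count fits in one base-16 digit, and the function names are made up for this example.

```python
def run_lengths(bits):
    """Lengths of the runs of '0', where each '1' separates two runs."""
    return [len(run) for run in bits.split("1")]

def to_hex(lengths):
    """Write each run length as one hex digit (assumes every run is < 16)."""
    if any(n > 15 for n in lengths):
        raise ValueError("run too long for one hex digit")
    return "".join(format(n, "X") for n in lengths)

def decode(lengths):
    """Rebuild the original bit string from the run lengths."""
    return "1".join("0" * n for n in lengths)

# The weather sequence from above: 13 zeros, a 1, 13 zeros, a 1, 6 zeros.
bits = "0" * 13 + "1" + "0" * 13 + "1" + "0" * 6
lengths = run_lengths(bits)
code = to_hex(lengths)
print(lengths, code, len(bits), 4 * len(code))
```

Decoding the run lengths gives back exactly the original 34 bits, so nothing is lost by storing only the three counts.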
Lossless Compression
• Morse Code
• Huffman Coding
• Basis of zip files
• TIFF images (one form)
• Dynamic
• Much more compression possible if we consider groups, i.e. “th” has a high
probability but “tq” doesn’t
• For images, most compression is frame to frame
Image Compression
• Most images contain a tremendous amount of redundancy
• Background
• Frame to frame
• We don’t need perfect reconstruction
• Usually we get an enormous amount of compression if we allow for small
errors which usually aren’t visible
Image Formats
• GIF
• Reduce number of colors
• Use a table of colors to reconstruct
• JPEG
• Divide image into small blocks
• Compress each block
• MPEG
• Add frame to frame compression