Class 19 - Notes
Upcoming Schedule
Project 4 is due Wednesday, 16 March.
To be eligible to take the Blue Belt test on Monday, 21 March, you must have (1) completed the Green Belt, and (2) submitted a satisfactory Project 4. (If you are not eligible to take the first Blue Belt test opportunity, you will have more opportunities for Blue Belt promotion later.)
The Blue Belt test will be during the class period on Monday, 21 March. Students who are not eligible to take the Blue Belt test will have a review class and help opportunity in a different location. (We'll figure out the locations after determining how many students are ready to take the Blue Best test.)
To pass the Blue Belt level you should demonstrate:
-
Ability to write and reason about functions that use conditional and looping statements.
-
Ability to write and reason about functions that are defined recursively.
-
Understand the list data type, and how to write and reason about functions that operator on lists (including lists of lists, and deeply nested lists).
-
Ability to analyze the cost of a function and basic understanding of how computing machines work.
Being promoted to a Blue Belt ensures at least a B- as the final grade in the course (with exceptions for extreme cases of misbehaior such as not attempting to do anything else for the rest of the semester or cheating).
Slides
Asymptotic Notation
O (Big-Oh): upper bound:
A function g is in O(f) iff there are positive constants c and n0 such that
g(n) ≤ c__f(n)
for all n ≥ n0.
Ω: lower bound:
A function g is in Ω(f) iff there are positive constants c and n0 such that
g(n) ≥ cf(n) for all n ≥ n0.
Θ: tight bound: A function g is in Θ(f) iff g is in O(f) and g is in Ω(f).
Draw a diagram showing: n3, O(n), O(n2), Ω(n2), Θ(n2), and O(1).
Why are these operators useful for understanding and describing the cost of executing programs?
If you want to know a cryptosystem is hard to break, what do you want to know about it (in terms of O or Ω)?
If you want to know your mega-fractal L-System drawing will finish in a reasonable time, what do you want to know about that program (in terms of O or Ω)?
Prove f(n) = n is in O(n2):
Prove f(n) = n is not in Ω(n2):
Prove f(n) = n is not in Θ(n2):
Prove f(n) = 7n + 13 is in Θ(n):
Cost of Computing
To reason about the cost of executing a program, we need to have a model of computation that makes it clear what we are actually measuring. Later in the class we'll look at the Turing machine model which provides a very precise and clear model of computation where each step is well defined and we can measure cost of computation as the number of steps to complete execution.
With Python code, we'll settle for a simpler, more intuitive model, that corresponds fairly closely (but not exactly) to the cost of running a program in the Python interpreter.
The assumption is that each individual statement or expression evaluation takes constant time. This does not include the time for any functions called inside an evaluation. So, we assume these operations are constant time (that is, we could count each of these as one unit of cost, and that cost does not scale with the size of the inputs or length of program):
Assignment statements. The cost of a = expression
is just the cost
of evaluating the expression plus a small constant for moving the
value into the location of a
.
Arithmetic and comparison operations on bounded values. If at least
one of the inputs to an operator (e.g., +
, *
, ==
, <=
, etc.)
has a bounded value (is known to be less than some fixed value), the
cost of computing the operation is constant.
List indexing and slicing. We will assume the cost of all list
operations is constant. This is most definitely not true, but
estimating an accurate cost gets into how the underlying list
abstraction is implemented. We'll keep things simple by assuming the
cost of both indexing (a[i]
) is constant (this is probably nearly
true), and slicing (a[1:]
) is also constant (this is false in
reality, but there are ways to implement lists that would make it
true).
Calling and returning from functions. The cost to call a function or return from a function is constant. Note that this does not include the cost of actually executing the function, just the cost of calling it.
Control flow. The cost of an if
, while
, or for
statement is
the cost of executing its components (and only constant additional
cost).
From these assumptions, we can reason about the cost of any Python program (at least those you'll see at this point of the course; additional assumptions may be needed for some later programs).
Loops
The cost of executing a loop is the average cost of each iteration times the number of iterations. For example,
sum = 0 for i in range(n): sum = sum + 1
The loop executes n
times, and each iteration does a constant amount
of work (the +
operation, where the value of one of the inputs is
bounded). So, the total work of the loop is in Θ(n) where n
is the value of n
.
Nested Loops
sum = 0 for i in range(n): for j in range(n): sum = sum + 1 sum = sum * 2
Here, the outer loop executes n
times, so we need to figure out the
cost of each iteration. Its body includes a constant time *
operation, but also a loop that executes n
time. The cost of the for
j in range(n): sum = sum + 1
loop is in Θ(n) where n is the
value of n
, since it executes n iterations, each of which does
constant work. Hence, the total cost is n * Θ(n) since there
are n iterations of the i
loop, each of which has cost in
Θ(n). This simplifies to a total cost for the loop in
Θ(n2).
Recursive Functions
Analyzing the cost of a recusive function is similar to a loop: we need to figure out the cost of each function execution, and multiply that by the number of executions.
For example,
def sum_list(lst): if len(lst) == 0: return 0 else: return lst[0] + sum_list(lst[1:])
Recall our assumption that the list indexing and slicing operations have
constant time, and assuming all of the elements in the list have bounded
values, so does the +
operation. Hence, the cost of executing the
function body (not counting the recursive call) is constant: O(1).
The number of recursive calls is the number of elements in the list:
each call shortens the length of the input list by one element, so we
need len(lst)
calls to reach the base case.
Hence, there are n executions of a constant-time function body, so the
total cost is in n * O(1) which simplifes to Theta;(n) where n is
the length of the input lst
.