Class 19 - Notes

Upcoming Schedule

Project 4 is due Wednesday, 16 March.

To be eligible to take the Blue Belt test on Monday, 21 March, you must have (1) completed the Green Belt, and (2) submitted a satisfactory Project 4. (If you are not eligible to take the first Blue Belt test opportunity, you will have more opportunities for Blue Belt promotion later.)

The Blue Belt test will be during the class period on Monday, 21 March. Students who are not eligible to take the Blue Belt test will have a review class and help opportunity in a different location. (We'll figure out the locations after determining how many students are ready to take the Blue Best test.)

To pass the Blue Belt level you should demonstrate:

Being promoted to a Blue Belt ensures at least a B- as the final grade in the course (with exceptions for extreme cases of misbehaior such as not attempting to do anything else for the rest of the semester or cheating).

Slides

Asymptotic Notation

O (Big-Oh): upper bound:

A function g is in O(f) iff there are positive constants c and n0 such that

       g(n) ≤ c__f(n)

for all nn0.

Ω: lower bound:

A function g is in Ω(f) iff there are positive constants c and n0 such that

      g(n) ≥ cf(n)
for all nn0.

Θ: tight bound: A function g is in Θ(f) iff g is in O(f) and g is in Ω(f).

Draw a diagram showing: n3, O(n), O(n2), Ω(n2), Θ(n2), and O(1).

Why are these operators useful for understanding and describing the cost of executing programs?

If you want to know a cryptosystem is hard to break, what do you want to know about it (in terms of O or Ω)?

If you want to know your mega-fractal L-System drawing will finish in a reasonable time, what do you want to know about that program (in terms of O or Ω)?

Prove f(n) = n is in O(n2):

Prove f(n) = n is not in Ω(n2):

Prove f(n) = n is not in Θ(n2):

Prove f(n) = 7n + 13 is in Θ(n):

Cost of Computing

To reason about the cost of executing a program, we need to have a model of computation that makes it clear what we are actually measuring. Later in the class we'll look at the Turing machine model which provides a very precise and clear model of computation where each step is well defined and we can measure cost of computation as the number of steps to complete execution.

With Python code, we'll settle for a simpler, more intuitive model, that corresponds fairly closely (but not exactly) to the cost of running a program in the Python interpreter.

The assumption is that each individual statement or expression evaluation takes constant time. This does not include the time for any functions called inside an evaluation. So, we assume these operations are constant time (that is, we could count each of these as one unit of cost, and that cost does not scale with the size of the inputs or length of program):

Assignment statements. The cost of a = expression is just the cost of evaluating the expression plus a small constant for moving the value into the location of a.

Arithmetic and comparison operations on bounded values. If at least one of the inputs to an operator (e.g., +, *, ==, <=, etc.) has a bounded value (is known to be less than some fixed value), the cost of computing the operation is constant.

List indexing and slicing. We will assume the cost of all list operations is constant. This is most definitely not true, but estimating an accurate cost gets into how the underlying list abstraction is implemented. We'll keep things simple by assuming the cost of both indexing (a[i]) is constant (this is probably nearly true), and slicing (a[1:]) is also constant (this is false in reality, but there are ways to implement lists that would make it true).

Calling and returning from functions. The cost to call a function or return from a function is constant. Note that this does not include the cost of actually executing the function, just the cost of calling it.

Control flow. The cost of an if, while, or for statement is the cost of executing its components (and only constant additional cost).

From these assumptions, we can reason about the cost of any Python program (at least those you'll see at this point of the course; additional assumptions may be needed for some later programs).

Loops

The cost of executing a loop is the average cost of each iteration times the number of iterations. For example,

sum = 0
for i in range(n):
   sum = sum + 1

The loop executes n times, and each iteration does a constant amount of work (the + operation, where the value of one of the inputs is bounded). So, the total work of the loop is in Θ(n) where n is the value of n.

Nested Loops

sum = 0
for i in range(n):
   for j in range(n):
       sum = sum + 1
   sum = sum * 2

Here, the outer loop executes n times, so we need to figure out the cost of each iteration. Its body includes a constant time * operation, but also a loop that executes n time. The cost of the for j in range(n): sum = sum + 1 loop is in Θ(n) where n is the value of n, since it executes n iterations, each of which does constant work. Hence, the total cost is n * Θ(n) since there are n iterations of the i loop, each of which has cost in Θ(n). This simplifies to a total cost for the loop in Θ(n2).

Recursive Functions

Analyzing the cost of a recusive function is similar to a loop: we need to figure out the cost of each function execution, and multiply that by the number of executions.

For example,

def sum_list(lst):
   if len(lst) == 0:
       return 0
   else:
       return lst[0] + sum_list(lst[1:])

Recall our assumption that the list indexing and slicing operations have constant time, and assuming all of the elements in the list have bounded values, so does the + operation. Hence, the cost of executing the function body (not counting the recursive call) is constant: O(1).

The number of recursive calls is the number of elements in the list: each call shortens the length of the input list by one element, so we need len(lst) calls to reach the base case.

Hence, there are n executions of a constant-time function body, so the total cost is in n * O(1) which simplifes to Theta;(n) where n is the length of the input lst.