Project 2

Problem 1

Write a function which takes two lists which "pairs up" the elements of the two lists. For example, if I have lists A = [1, 2, 3] and B = ['x', 'y', 'z'], I should get a new list [(1, 'x'), (2, 'y'), (3, 'z')]. If the lists have different lengths, then just pair elements as far as the shorter list.

This operation is commonly known as zipping two lists together.

Bonus: Using this and array slicing, can you reimplement the earlier exercise on pairing up sequential elements in one line?

Problem 2

Suppose you have a list of numbers. A common task is to determine whether or not a number is contained in the list. Although Python allows you to do this with the in statement, let's think about how we may do this.

One way is to just step through the list element by element and check if it matches our number. The running time of this is clearly proportional to the length of the list.

Now, suppose that we knew that our list was sorted (or we asked Python to sort it). Can we do better than the previous algorithm? Yes! A completely different strategy would be what you would use in a "high-low" guessing game by repeatedly cutting the range of possible numbers in half.

The idea is the following: Suppose that we compare our number to the number in the "center" of the list. If our number is less than the center number, we can get rid of the upper half of the list and repeat the search on the lower half. Similarly, if our number if larger than the center number, we can get rid of the lower half and search the upper half of the list.

Problem 3

Write a function which takes an integer $n$ a produces a set of the prime numbers less than $n$. One of the classical methods for doing this is the Sieve of Eratosthenes.

A Summary of the Method:

The basic idea is to start with a list of length $n$ which contains True in each position. We'll go ahead and set element 0 and 1 to False since 0 and 1 are not prime.

Now, we'll start with a counter $p=2$ and mark all multiples $4,6,8,\cdots$ in the list to be False since we know that $2$ is a factor. Hence, we've eliminiated all numbers which contained a factor of $2$.

Now, we'll increment $p$ until we get to the first non-eliminated number. In this case, $3$ and we'll repeat the procedure by marking off things which have a factor of $3$.

And we'll repeated this over and over. It turns out that, we only need to go to $p \le \sqrt{n}$ before stopping.

Problem 4

Create a function which takes a "tree" of numbers. This means, it's a list whose elements are either a number or another "tree" of numbers. For example,

T = [1, 2, [3, 4, 5], 6, [7, [8, 9], 10, [11, [12]], 13], 14]

Write a function which flattens a "tree" down to a single list. For example, in this case, we'd end up with

flatten(T) == [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]

For this problem, you'll need to be able to tell the difference between different data types. In Python, we can do this using the isinstance function. For example, isinstance(3, int) == True. Or, isinstance([2, 3, 4], list) == True. But, isinstance("hello", float) == False.

Problem 5

Now suppose you wanted to reorganize your list of birthdays (from the notes on data structures) as follows. Instead of keeping track of everything as a list of names and birthdays, you want to convert this into a lookup table which takes a date and returns a list of everyone whose birthday is on that day.

Write a program which converts such a list into a lookup table.

As an example, something like

[ ("Bill", "Apr 1"), ("Ron", "Jun 6"), ("Kat", "May 12"), ("Eric", "Jun 6"), ("Que", "Feb 10") ]

should produce a lookup table with which you can say

result["Apr 1"] = ["Bill"]

or

result["Jun 6"] = ["Ron", "Eric"]

Problem 6

Suppose you want to represent a directed graph in the following way: encode the edges as a dictionary E taking a vertex to the neighbors it goes to. You can take the vertices to be labelled by strings.

You want to find which nodes can be reached from a starting vertex. To do this, we'll use a method called depth-first search. The idea is straight-forward:

Initialize an empty set of visited vertices.
Pick a starting vertex v0 as our current vertex.
The search algorithm works as follows:
    If the current vertex is not in the visited set, then:
        Add it to the visited set.
        Run the search recursively on each of it's neighbors.

When this finishes, it will contain all the vertices reachable from our starting vertex v_0.

Implement this algorithm as a function called reach which takes arguments: E, v0, visited. To run it, you might use something like the following:

visited = set([])
v0 = 'A'
reach(E, v0, visited)
print visited # will contain visited nodes now!

Try running this on a few nodes and print out which vertices they can and cannot reach.