Iterators and Iterable

Iterators play a huge role in Java, as we saw in IC211. At its most basic, the Iterator interface gives us "hasNext()" and "next()". An object implements the "Iterable" interface if it has a method "iterator()" that returns an Iterator for the object.

Python has the same idea, though the details differ a little. First of all, a Python interface is not a formally defined language construct as it is in Java. Instead, an interface is just a list of method names you need to define (perhaps with some guidelines on behavior). So, to meet the "Iterator" interface, an object needs to define a method __next__(self) that (1) returns the next object in the iteration, or (2) raises a StopIteration exception if there's nothing left. Meanwhile, to meet the "Iterable" interface, an object needs to define a method __iter__(self) that returns an Iterator for the object.
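To make the two interfaces concrete, here is a small sketch that drives them by hand on a built-in list (lists are Iterables). The variable names are just for illustration.

```python
# Drive the Iterable/Iterator protocol by hand on a built-in list.
letters = ["a", "b"]          # lists implement the Iterable interface

it = letters.__iter__()       # ask the Iterable for an Iterator
first = it.__next__()         # first element: "a"
second = it.__next__()        # second element: "b"

# A third call has nothing left to return, so it raises StopIteration.
try:
    it.__next__()
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, exhausted)
```

In everyday code you would normally write iter(letters) and next(it), which call these same methods for you.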

However, the magic is that Python is built on Iterables/Iterators. The Python for loop
for i in X:
works as long as X implements the Iterable interface. In fact, the same is true of list comprehensions. So, for example,
[i**2 for i in X]
works as long as X implements the Iterable interface. So our same code can be used like this:
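The Countdown listing that this part of the notes refers to is not reproduced in this text. As a stand-in, here is a minimal sketch, assuming Countdown(n) counts from n down to 1 (which matches the list(Countdown(5)) output shown later). The class name CountdownIterator and the field names are my own assumptions, not necessarily what the original code used.

```python
class CountdownIterator:
    """Iterator producing n, n-1, ..., 1 (hypothetical name and fields)."""

    def __init__(self, n):
        self.current = n

    def __next__(self):
        # Nothing left once we drop below 1.
        if self.current < 1:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


class Countdown:
    """Iterable: each call to __iter__ hands back a fresh iterator."""

    def __init__(self, n):
        self.n = n

    def __iter__(self):
        return CountdownIterator(self.n)


# Because Countdown is an Iterable, both of these just work.
for i in Countdown(3):
    print(i)

squares = [i**2 for i in Countdown(4)]
print(squares)
```

Note the design choice: Countdown itself keeps no iteration state, so the same Countdown object can be looped over as many times as you like.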

TODO
  1. Take the above code and modify it so there is a lower bound (inclusive) for the countdown. So, for example, we should get the following:
    >>> [x**2 for x in Countdown(8,5)]
    [64, 49, 36, 25]
    >>> for i in Countdown(5,1):
    ...     print(f"({i})",end="")
    ... 
    (5)(4)(3)(2)(1)
    If you want to still be able to write "Countdown(8)" and have it work just like "Countdown(8,1)", there's a nice way to do it using "default parameter values". Here's an example: if I define function foo as
    def foo(x,y=1):
        return x + y
    ... then foo(5,3) produces 8, but foo(5) is still legal, because we have defined the second parameter, y, with a default value of 1, so if there is only one argument, x gets that argument and y gets the value 1. So foo(5) gives 6.
  2. Write an Iterator class named "Fiberator" and an Iterable class named "Fibseq" following the previous examples. However, the trick now is that Fibseq(z) will be an object which, when you call its __iter__() method, gives an iterator over the first z elements of the Fibonacci sequence. So, for example,
    $ python3 -i ex1.py
    >>> [f for f in Fibseq(4)]
    [1, 1, 2, 3]
    >>> [f for f in Fibseq(10)]
    [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
    
    Recall that the Fibonacci sequence is defined by $f(n) = f(n-1) + f(n-2)$, with $f(0) = f(1) = 1$. So now the Fiberator has to remember how many terms it has produced so far and how many it is supposed to produce, as well as the last two values, so it can compute the next value when it is requested.
    Hint: you will probably want your iterator to have fields nextval, prevval, count and N, and "nextval = nextval + prevval" will have to appear somewhere! Also: make prevval=0 initially.

Iterables vs lists

An Iterable is a thing that can give you an Iterator, which in turn will produce successive objects as you ask for them. This is different from a list, which has all of its objects already produced and sitting in memory somewhere. Let's explore that a bit.

First of all, you can always create a list from an Iterable simply by calling the list constructor. Like this:

>>> list(Countdown(5))
[5, 4, 3, 2, 1]
To see the difference between an Iterable and a list, just try
	  c = Countdown(99999999) # this takes no time
	  L = list(c) # this takes some time!
We see this with range() all the time. We get an Iterable from range, not an actual list. Try this:
	  r = range(0,100)
	  r # what do you see?
	  L = list(range(0,100))
	  L # what do you see?
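Another way to see the difference is to compare how much memory the two objects occupy. The sketch below uses sys.getsizeof from the standard library. The exact byte counts vary by Python version and platform, so it only compares them.

```python
import sys

r = range(0, 1_000_000)   # an Iterable: just remembers start, stop, step
L = list(r)               # a list: one slot per element, all in memory

# getsizeof reports the size of the container object itself; for the
# list that is the array of references, not the int objects it refers to.
print(sys.getsizeof(r))
print(sys.getsizeof(L))

# The range still knows its length and can compute any element on demand.
print(len(r), r[10])
```

The range object stays tiny no matter how large its bounds are, while the list grows with the number of elements.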

TODO
  1. Write a function first2 that takes an Iterable and uses __iter__() to get an Iterator, and __next__() on the Iterator to fetch the first two values of the Iterable and return them as a tuple.
    >>> c = Countdown(99)
    >>> first2(c)
    (99, 98)    
    >>> r = range(10,20)
    >>> first2(r)
    (10, 11)
    
    >>> first2(["red","fish","blue","fish"])
    ('red', 'fish')
    
    >>> first2(["ha"])
    Exception: Not enough elements
    
  2. If the name of your Python file is sol1.py, try this:
    >>> f = open("sol1.py","r")
    >>> list(f)
    What do you get? What happens if you do list(f) again? Do you see what's going on here? Now try:
    >>> f = open("sol1.py","r")
    >>> first2(f)
    >>> first2(f)
    Compare that to L = ["one","fish","two"] and evaluating first2(L) twice in a row.

    Important! The object returned by open("filename","r") implements the Iterable interface, where the Iterator you get returns a line from the file with each call to __next__(). This allows you to for-loop over the lines of a file or do a list comprehension over the lines of the file. Very handy!
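Here is a small sketch of both uses. So that it runs anywhere, it first writes a throwaway three-line file using the standard tempfile module. The file contents and variable names are made up for the example.

```python
import os
import tempfile

# Create a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("one\ntwo\nthree\n")
    path = tmp.name

# A for loop over the file object visits one line per iteration.
with open(path, "r") as f:
    for line in f:
        print(repr(line))   # each line still ends with its newline

# A list comprehension over the file object works the same way.
with open(path, "r") as f:
    lengths = [len(line.strip()) for line in f]
print(lengths)

os.remove(path)             # clean up the temporary file
```

Notice that the file is reopened for the second pass, so each pass starts from a fresh file object.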

zip: Iterables from Iterables

The function zip is really interesting. It takes two (or more) Iterables and groups their elements into tuples. With two Iterables you get pairs, like this:
>>> z = zip(range(1,4),["alpha","beta","gamma"])
>>> z
<zip object at 0x740f6771ef00>
>>> list(z)
[(1, 'alpha'), (2, 'beta'), (3, 'gamma')]
Important! The zip function returns a zip object, which is an Iterable. That means it doesn't pair anything up when you call zip; it only pairs things up as calls to __next__() on the iterator require it.

Note: zip stops as soon as either iterator runs out of stuff.
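Both points can be checked with a short sketch. The lists here are made up for the example, and the second one is deliberately one element longer than the range.

```python
numbers = range(1, 4)                          # 1, 2, 3
words = ["alpha", "beta", "gamma", "delta"]    # one extra element

# zip stops when the shorter input runs out, so "delta" is never paired.
pairs = list(zip(numbers, words))
print(pairs)

# Pairing happens on demand: pull one pair, then collect whatever is left.
z = zip(numbers, words)
first_pair = next(z)      # next(z) calls z.__next__()
rest = list(z)
print(first_pair, rest)
```

The second half shows that a zip object is consumed as you go: once a pair has been handed out, list(z) only sees the remaining ones.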

TODO
  1. Given the definitions:
    A = ['u','b','w','p','k','z']
    B = [5,9,2,7,1,6]
    ... use zip to write a simple dictionary comprehension that maps each letter in A to the corresponding number in B. Note: do not use range() or len(), that would be cheating!
  2. Download the file in1.txt. Our goal is to create a list containing the content of lines 2, 5 and 6, and to do so somewhat efficiently, meaning that we shouldn't read more lines of the file (which in theory could be long) than we need to. I want you to write a single line of Python code that does this! So, for the example file in1.txt, your line should produce:
    ['Arthur Dent', 'IL', 'Office Manager']
    ... though, of course, it should work for any file formatted this same way, so you cannot hardcode 'Arthur Dent', 'IL' or 'Office Manager'.

    • You might want to start off by using zip (perhaps in combination with range ...) to get a list of the first X lines, for whatever X you deem appropriate.
    • Remember that list comprehensions allow for an "if clause" that filters which items you want.
    • Strings have a method .strip() that removes leading and trailing whitespace (including newlines) that might help get everything just so.


Christopher W Brown