Using zip() to structure and flatten sequences
The zip() function interleaves values from several iterators or sequences. Given n input iterables or sequences, it creates a sequence of n-tuples, each built by taking one value from each input. We used it in the previous section to interleave data points from two sets of samples, creating two-tuples.
The following is an example of code that shows what the zip() function does:
>>> xi = [1.47, 1.50, 1.52, 1.55, 1.57, 1.60, 1.63, 1.65,
... 1.68, 1.70, 1.73, 1.75, 1.78, 1.80, 1.83,]
>>> yi = [52.21, 53.12, 54.48, 55.84, 57.20, 58.57, 59.93, 61.29,
... 63.11, 64.47, 66.28, 68.10, 69.92, 72.19, 74.46,]
>>> zip(xi, yi)
<zip object at 0x101d62ab8>
>>> list(zip(xi, yi))
[(1.47, 52.21), (1.5, 53.12), (1.52, 54.48),
(1.55, 55.84), (1.57, 57.2), (1.6, 58.57),
(1.63, 59.93), (1.65, 61.29), (1.68, 63.11),
(1.7, 64.47), (1.73, 66.28), (1.75, 68.1),
(1.78, 69.92), (1.8, 72.19), (1.83, 74.46)]
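The session above shows that zip() returns a zip object rather than a list. As a minimal sketch of what that implies, the following uses a shortened, hypothetical version of the sample data; the names pairs, first_pass, second_pass, x_again, and y_again are ours, not from the text. It illustrates that the zip object is a lazy, single-pass iterator, and that zip(*...) can be used to flatten the two-tuples back into separate sequences.

```python
# Hypothetical sample data, shortened from the lists used in the text.
xi = [1.47, 1.50, 1.52]
yi = [52.21, 53.12, 54.48]

pairs = zip(xi, yi)          # a lazy iterator; no tuples are built yet
first_pass = list(pairs)     # consuming the iterator builds the two-tuples
second_pass = list(pairs)    # the iterator is exhausted, so this is empty

print(first_pass)
print(second_pass)

# zip(*...) reverses the structuring: one tuple per original sequence.
x_again, y_again = zip(*first_pass)
print(x_again)
print(y_again)
```

If the pairs are needed more than once, materializing them with list() or tuple(), as first_pass does here, is one simple option.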
There are a number of edge cases for the zip() function. We must ask the following questions about its behavior:
- What happens when there are no arguments at all?
- What happens when there's only one argument?
- What happens when the sequences are different lengths?
As with other functions, such as any(), all(), len(), and sum(), we want an identity value as a result when applying the reduction to an empty sequence. For example, sum(()) should be zero. This concept tells us what the identity value for zip() should be.
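The identity values mentioned above can be checked directly. The following is a small sketch, with data and empty as illustrative names of our own; it suggests why these particular values are the natural ones: appending an empty sequence to the input leaves the result of each reduction unchanged.

```python
data = (3, 5, 8)
empty = ()

# Each reduction over an empty sequence yields its identity value.
identities = {
    "sum": sum(empty),
    "any": any(empty),
    "all": all(empty),
    "len": len(empty),
}
print(identities)

# Combining with the empty sequence does not change the result.
sum_is_stable = sum(data + empty) == sum(data) + sum(empty)
any_is_stable = any(data + empty) == (any(data) or any(empty))
all_is_stable = all(data + empty) == (all(data) and all(empty))
print(sum_is_stable, any_is_stable, all_is_stable)
```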
Clearly, each of these edge cases must produce some kind of iterable output. Here are some examples of code that clarify the behaviors. First, the empty argument list:
>>> zip()
<zip object at 0x101d62ab8>
>>> list(_)
[]
We can see that the zip() function with no arguments returns an iterator, but it won't yield any items. This fits the requirement that the output is iterable.
Next, we'll try a single iterable:
>>> zip((1, 2, 3))
<zip object at 0x101d62ab8>
>>> list(_)
[(1,), (2,), (3,)]
In this case, the zip() function emitted a one-tuple for each input value. This, too, makes considerable sense.
Finally, we'll look at how the zip() function handles inputs of different lengths:
>>> list(zip((1, 2, 3), ('a', 'b')))
[(1, 'a'), (2, 'b')]
This result is debatable. Why truncate? Why not pad the shorter list with None values? This alternate definition of the zip() function is available in the itertools module as the zip_longest() function. We'll look at this in Chapter 8, The Itertools Module.
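As a brief preview of the padding alternative, here is a minimal sketch comparing zip() with itertools.zip_longest() on the same inputs. The variable names and the '?' fill value are our own choices for illustration; fillvalue is an optional keyword parameter of zip_longest() that defaults to None.

```python
from itertools import zip_longest

numbers = (1, 2, 3)
letters = ('a', 'b')

# zip() stops as soon as the shortest input is exhausted.
truncated = list(zip(numbers, letters))

# zip_longest() continues to the longest input, padding with None.
padded = list(zip_longest(numbers, letters))

# A different padding value can be supplied with fillvalue.
padded_custom = list(zip_longest(numbers, letters, fillvalue='?'))

print(truncated)
print(padded)
print(padded_custom)
```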