Using Generators
A generator is a lazy iterator. That is, it loads the requested content into memory on demand, e.g., when it is accessed via iteration. This allows very large files to be loaded into memory piece by piece, enabling computations that would otherwise be prohibited by available RAM.
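To make the memory argument concrete, the sketch below compares a fully materialized list with an equivalent generator. The size of one million elements is an arbitrary choice for illustration, and `sys.getsizeof` reports only the container's own footprint, not the objects it references.

```python
import sys

n = 1_000_000  # arbitrary size chosen for illustration

squares_list = [i * i for i in range(n)]  # all values held in memory at once
squares_gen = (i * i for i in range(n))   # values produced one at a time

list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)

print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# Both produce the same values; the generator never stores them all at once.
total = sum(squares_gen)
```

The generator's size stays small regardless of `n`, because it only keeps the state needed to produce the next value.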
Creating a Generator
In Python, there are two ways to create generators:
1. Generator function via `yield`. The `yield` keyword operates similarly to `return`, but calling the function returns a lazy iterator (generator) instead of a single value. `yield` is usually nested within a loop so that each call to the generator's `__next__` method produces the next item.
```python
def generate_row(filename):
    # Yield one line at a time; the file is closed once the generator finishes.
    with open(filename, 'r') as f:
        for row in f:
            yield row

row_generator = generate_row(filename)
```
2. Generator expression. With syntax similar to a list comprehension, a generator expression wraps the logic in `(...)`, yielding a lazy iterator.
```python
row_generator = (row for row in open(filename, 'r'))
```
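As a quick check that the two approaches are interchangeable, the following sketch runs both against a small throwaway file. The temporary file and its contents are assumptions made purely so that the example is self-contained.

```python
import os
import tempfile

# Write a small throwaway file so the example is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("a,1\nb,2\nc,3\n")
    filename = tmp.name

def generate_row(filename):
    # Generator function: yields one line per iteration.
    with open(filename, "r") as f:
        for row in f:
            yield row

from_function = list(generate_row(filename))

with open(filename, "r") as f:
    row_expression = (row for row in f)   # generator expression
    from_expression = list(row_expression)

os.remove(filename)  # clean up the throwaway file
print(from_function == from_expression)
```

A generator expression is convenient for one-line transformations, while a generator function leaves room for setup, cleanup, and more involved logic.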
Using a Generator
Recall that a generator is a lazy iterator, so its entries must be accessed via iteration. Fundamentally, this access is via the object's `__next__` method, which is called implicitly in a `for` loop but may also be called explicitly, for example in a `while` loop with `next(generator)`.
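The relationship between `__next__`, the built-in `next`, and the end of iteration can be sketched as follows. `count_up_to` is a hypothetical helper introduced only for this illustration.

```python
def count_up_to(limit):
    # Hypothetical helper used only for this illustration.
    n = 1
    while n <= limit:
        yield n
        n += 1

gen = count_up_to(3)

first = gen.__next__()  # explicit call to the method
second = next(gen)      # the built-in next() calls __next__ for us
third = next(gen)

# Once the generator is exhausted, a further call raises StopIteration,
# which is the signal a for loop uses to stop.
try:
    next(gen)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, second, third, exhausted)
```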
1. One-off access via `next`.
```python
row0 = next(row_generator)
```
2. Access via `while` loop. Here we use the default argument of `next(x, default)` to return `None` once the generator is exhausted.
```python
while True:
    row = next(row_generator, None)
    if row is None:
        break
    # process row
```
3. Access via `for` loop. This is the most common and intuitive.
```python
for row in row_generator:
    ...  # process row
```
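All of these access patterns draw from the same underlying stream, so each row is produced only once. The sketch below illustrates this with an in-memory list standing in for file rows (an assumption made to keep the example self-contained).

```python
# In-memory stand-in for rows read from a file (illustrative data only).
row_generator = (row for row in ["r0\n", "r1\n", "r2\n"])

row0 = next(row_generator)  # consumes the first row

remaining = []
for row in row_generator:   # picks up where next() left off
    remaining.append(row.strip())

# The generator is now exhausted: a second pass yields nothing,
# and next() with a default returns that default.
second_pass = list(row_generator)
leftover = next(row_generator, None)

print(row0.strip(), remaining, second_pass, leftover)
```

To iterate over the data again, create a new generator rather than reusing the exhausted one.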