Advanced ⏱ 25 min

Generators

Generators produce values one at a time, on demand, instead of building the whole collection in memory. That makes them essential for working with large files, streams, and infinite sequences.

The Problem: Memory

If you want to process a million items, building a list of all million items first wastes memory. A generator produces each item only when asked.

python
import sys

# List — builds ALL items in memory immediately
big_list = [x**2 for x in range(1_000_000)]
print(f"List size:      {sys.getsizeof(big_list):,} bytes")

# Generator — builds nothing upfront, produces one item at a time
big_gen = (x**2 for x in range(1_000_000))
print(f"Generator size: {sys.getsizeof(big_gen):,} bytes")

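# Both produce the same values when iterated.
# The generator object stays at a few hundred bytes at most (the exact figure
# varies by Python version), while the list needs ~8 MB for its pointers alone;
# getsizeof() does not count the million int objects the list also keeps alive.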

# You can still use a generator in for loops and sum()
total = sum(x**2 for x in range(100))
print(f"Sum of squares 0-99: {total}")

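One trade-off of producing values on demand is that a generator can be consumed only once, while a list can be re-read as often as you like. The short sketch below illustrates this; the variable names are made up for the example.

```python
# A list can be iterated repeatedly; a generator is exhausted after one pass.
squares_list = [x**2 for x in range(5)]
squares_gen = (x**2 for x in range(5))

list_first = sum(squares_list)
list_second = sum(squares_list)   # same result: the list still holds its items

gen_first = sum(squares_gen)      # consumes every value the generator has
gen_second = sum(squares_gen)     # nothing left, so the sum of no values is 0

print(list_first, list_second)    # 30 30
print(gen_first, gen_second)      # 30 0
```

If you need the values more than once, either build a list after all or create a fresh generator for each pass.
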
Generator Functions with yield

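A generator function contains yield. Calling it does not run any of its body; it returns a generator object. The first next() call on that object runs the body up to the first yield, and each later next() call resumes right after the yield where execution paused.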

python
def countdown(n):
    print("Starting countdown!")
    while n > 0:
        yield n         # pause here, hand back n
        n -= 1
    print("Done!")

gen = countdown(3)
print(next(gen))   # prints "Starting countdown!" then yields 3
print(next(gen))   # resumes, yields 2
print(next(gen))   # resumes, yields 1
# A fourth next(gen) would print "Done!" and then raise StopIteration

# Usually you just use it in a for loop
for value in countdown(5):
    print(value, end=" ")

# Infinite generator — perfectly fine because it's lazy
def integers_from(n):
    while True:
        yield n
        n += 1

gen = integers_from(10)
print([next(gen) for _ in range(5)])  # [10, 11, 12, 13, 14]

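To make the pause-and-resume behaviour easier to see, here is a small sketch that records when each part of a generator's body actually runs. The events list and the countdown_logged name are hypothetical helpers invented for this illustration. It also shows two standard-library conveniences: next() with a default value, which avoids a StopIteration exception, and itertools.islice, which takes a bounded slice from an infinite generator.

```python
from itertools import islice

events = []  # hypothetical log, used only to record execution order

def countdown_logged(n):
    events.append("started")
    while n > 0:
        yield n
        n -= 1
    events.append("finished")

gen = countdown_logged(2)
created_log = list(events)       # []: creating the generator ran none of the body

first = next(gen)                # body runs up to the first yield
after_first_log = list(events)   # ["started"]

second = next(gen)               # resumes after the yield, loops, yields again
sentinel = next(gen, None)       # generator ends; the default is returned instead of raising
final_log = list(events)         # ["started", "finished"]

def integers_from(n):
    while True:
        yield n
        n += 1

# islice stops after 5 items, so the infinite loop is never a problem
first_five = list(islice(integers_from(10), 5))
print(first_five)                # [10, 11, 12, 13, 14]
```

islice does the same job as calling next() in a list comprehension, but it also stops cleanly if the generator happens to run out early.
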
Real-World Use Cases

python
# Fibonacci — infinite sequence using almost no memory
def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# Take first 10
fib = fibonacci()
first_10 = [next(fib) for _ in range(10)]
print(first_10)  # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]

# Process a (simulated) large CSV line-by-line
def read_large_file(lines):
    for line in lines:
        yield line.strip()

fake_csv = ["alice,25", "bob,30", "carol,22"]
for row in read_large_file(fake_csv):
    name, age = row.split(",")
    print(f"{name} is {age} years old")

# Generator pipeline — each stage is lazy
data = range(1, 11)
doubled  = (x * 2 for x in data)
filtered = (x for x in doubled if x > 10)
result   = list(filtered)
print(result)   # [12, 14, 16, 18, 20]

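The simulated CSV above uses a plain list, so here is a rough sketch of the same idea against a real file on disk. It assumes a two-column name,age layout like the sample rows; read_rows and adults are hypothetical helper names, and the temporary file (including the extra "dave" row) is made-up sample data that only exists so the example can run on its own.

```python
import csv
import os
import tempfile

def read_rows(path):
    """Yield one parsed CSV row at a time; the file is never loaded whole."""
    with open(path, newline="", encoding="utf-8") as handle:
        for row in csv.reader(handle):
            yield row

def adults(rows, min_age=18):
    """Second pipeline stage: keep rows whose age is at least min_age."""
    for name, age in rows:
        if int(age) >= min_age:
            yield name, int(age)

# Write a small sample file; a real program would already have one on disk.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, encoding="utf-8", newline=""
) as tmp:
    tmp.write("alice,25\nbob,30\ncarol,22\ndave,15\n")
    path = tmp.name

try:
    # Nothing is read until list() starts pulling rows through the pipeline.
    result = list(adults(read_rows(path)))
finally:
    os.remove(path)

print(result)  # [('alice', 25), ('bob', 30), ('carol', 22)]
```

Because each stage pulls one row at a time from the stage before it, memory use stays roughly constant no matter how many lines the file has.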