How generators work (yield)

Generators in Python: An Efficient Approach to Data Sequences

Generators in Python are a mechanism for working efficiently with data sequences. They let you create iterable objects that compute values on demand, so the full sequence never has to be held in memory. At the core of generators is the yield keyword, which, unlike return, pauses a function instead of ending it.

Basics of Python Generators

Generator functions are functions that contain the yield keyword. Calling one does not run its body; it returns a generator object, which is an iterator. The yield keyword hands a value to the caller and saves the function's state between calls.

Each call to next() on the generator resumes execution from the point where the last yield paused it. This makes generators well suited to processing large data streams without loading them fully into memory.

def simple_generator():
    yield 1
    yield 2
    yield 3

gen = simple_generator()
for value in gen:
    print(value)

Execution result:

1
2
3
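
The distinction between calling the generator function and advancing the generator can be checked directly. The following is a minimal sketch that reuses simple_generator; the variable names are illustrative only. It suggests that every call to the function creates an independent generator with its own position:

```python
def simple_generator():
    yield 1
    yield 2
    yield 3

# Each call to the generator function creates a separate generator object.
first = simple_generator()
second = simple_generator()

a = next(first)   # first value of `first`
b = next(first)   # second value of `first`
c = next(second)  # `second` starts from the beginning

# A generator is its own iterator, so iter() returns the same object.
same = iter(first) is first

# Only the values that were not consumed yet are left in `first`.
remaining = list(first)

print(a, b, c, same, remaining)
```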

Mechanism of the yield Keyword

The yield keyword works on the principle of "lazy evaluation":

  • The first call to next() runs the function from the beginning up to the first yield.
  • After yielding the value, the function "freezes" its state (local variables and current position).
  • The next call to next() resumes execution right after that yield.
  • The process continues until the function finishes, at which point StopIteration is raised.

def counter():
    print("Generator start")
    yield 1
    print("After first yield")
    yield 2
    print("After second yield")

gen = counter()
next(gen)  # Prints "Generator start", returns 1
next(gen)  # Prints "After first yield", returns 2
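
What happens on a third call is worth seeing as well. The sketch below repeats the counter example and adds one more next(); the names exhausted and fallback are introduced here only for illustration:

```python
def counter():
    print("Generator start")
    yield 1
    print("After first yield")
    yield 2
    print("After second yield")

gen = counter()
first = next(gen)   # runs up to the first yield
second = next(gen)  # resumes and runs up to the second yield

try:
    next(gen)       # runs the rest of the body, then the function ends
    exhausted = False
except StopIteration:
    exhausted = True

# Passing a default to next() avoids the exception on an exhausted generator.
fallback = next(gen, None)
```

A for loop handles StopIteration automatically, which is why it is rarely seen when generators are consumed by iteration.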

Advantages of Generators over Lists

Using generators instead of lists offers significant advantages when working with large amounts of data:

  • Memory Saving: Generators do not store all values at once; each value is created when it is requested.
  • Performance: The first value is available immediately, without waiting for the full sequence to be built.

# List: all one million values are stored in memory
squares_list = [x * x for x in range(1000000)]

# Generator: values are produced one at a time
squares_gen = (x * x for x in range(1000000))
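
The memory difference can be observed with sys.getsizeof. This is a rough sketch, not a benchmark: getsizeof reports only the size of the container object itself (for the list, its array of references, not the integers), and the exact numbers depend on the Python version and platform.

```python
import sys

squares_list = [x * x for x in range(1_000_000)]
squares_gen = (x * x for x in range(1_000_000))

# Size of the list object versus the size of the generator object.
list_size = sys.getsizeof(squares_list)
gen_size = sys.getsizeof(squares_gen)
print(f"list: {list_size} bytes, generator: {gen_size} bytes")

# The generator can still be consumed, one value at a time.
total = sum(squares_gen)
```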

Creating Infinite Sequences

Because values are produced one at a time, a generator can represent an infinite sequence without exhausting memory:

def infinite_counter(start=0):
    while True:
        yield start
        start += 1

counter = infinite_counter()
print(next(counter))  # 0
print(next(counter))  # 1
print(next(counter))  # 2
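
An infinite generator must be bounded by the consumer. One possible approach, sketched here with itertools.islice from the standard library and a plain break, is:

```python
from itertools import islice

def infinite_counter(start=0):
    while True:
        yield start
        start += 1

# Take only the first five values; the generator is never run further.
first_five = list(islice(infinite_counter(), 5))

# Alternatively, stop explicitly with a condition inside the loop.
collected = []
for n in infinite_counter(10):
    if n > 12:
        break
    collected.append(n)

print(first_five, collected)
```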

Methods for Working with Generators

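Generators are driven by the built-in next() function and three generator methods:

  • next(gen): gets the next value from the generator
  • gen.send(value): resumes the generator and passes a value in; it becomes the result of the paused yield expression
  • gen.throw(exc): raises an exception inside the generator at the paused yield
  • gen.close(): stops the generator by raising GeneratorExit inside it

def interactive_generator():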
    value = yield "Start"
    while True:
        value = yield f"Received: {value}"

gen = interactive_generator()
print(next(gen))        # Start
print(gen.send(42))     # Received: 42
print(gen.send("Hello")) # Received: Hello
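
The example above covers send(). For throw() and close(), a minimal sketch might look as follows; the accumulator function and its reset-on-ValueError behaviour are assumptions made up for this illustration:

```python
def accumulator():
    total = 0
    while True:
        try:
            value = yield total
        except ValueError:
            total = 0              # an injected ValueError resets the sum
        else:
            if value is not None:
                total += value

gen = accumulator()
primed = next(gen)                 # run to the first yield before using send()
after_5 = gen.send(5)
after_10 = gen.send(10)

# throw() raises the exception at the paused yield and returns the next yielded value.
after_reset = gen.throw(ValueError("reset"))

# close() raises GeneratorExit inside the generator; afterwards it is exhausted.
gen.close()
after_close = next(gen, "closed")
```

Note that a generator has to be advanced to its first yield (with next() or send(None)) before a non-None value can be sent.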

Exception Handling in Generators

Generators support exception handling, including the special GeneratorExit exception, which is raised inside the generator when close() is called:

def example_generator():
    try:
        yield 1
        yield 2
    except GeneratorExit:
        print("Generator closed!")

gen = example_generator()
print(next(gen))  # 1
gen.close()       # Generator closed!
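
When the goal is cleanup rather than reacting to GeneratorExit specifically, a try/finally block is a common alternative. The sketch below uses a hypothetical managed_resource generator and a plain list as a log to show that the cleanup runs both on early close() and on normal exhaustion:

```python
def managed_resource(log):
    log.append("open")
    try:
        yield 1
        yield 2
    finally:
        # Runs when the generator is closed early or finishes normally.
        log.append("cleanup")

# Closed early after the first value.
early_log = []
gen = managed_resource(early_log)
first = next(gen)
gen.close()

# Consumed to the end.
full_log = []
values = list(managed_resource(full_log))
```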

Practical Examples of Use

Reading Large Files

def read_large_file(filepath):
    with open(filepath, 'r', encoding='utf-8') as file:
        for line in file:
            yield line.strip()

# Processing a file line by line
# process_line is a placeholder for your own handling code
for line in read_large_file('big_data.txt'):
    process_line(line)
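
To try this pattern without a real data file, the sketch below writes a small temporary file and reads it back lazily. The file contents and the lengths list are assumptions for illustration, and this variant strips only the trailing newline:

```python
import os
import tempfile

def read_large_file(filepath):
    with open(filepath, "r", encoding="utf-8") as file:
        for line in file:
            yield line.rstrip("\n")

# Create a small stand-in for a large file.
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".txt", delete=False
) as tmp:
    tmp.write("alpha\nbeta\ngamma\n")
    path = tmp.name

try:
    # Only one line is held in memory at a time.
    lengths = [len(line) for line in read_large_file(path)]
finally:
    os.remove(path)

print(lengths)
```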

Data Filtering

def even_numbers(numbers):
    for number in numbers:
        if number % 2 == 0:
            yield number

result = even_numbers(range(10))
print(list(result))  # [0, 2, 4, 6, 8]

Processing API Data

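import requests

def fetch_paginated_data(api_url):
    # Assumes the API accepts a 'page' query parameter
    # and returns JSON with an 'items' list
    page = 1
    while True:
        response = requests.get(api_url, params={"page": page}, timeout=10)
        response.raise_for_status()  # fail on HTTP errors instead of parsing an error body
        data = response.json()

        if not data['items']:
            break

        for item in data['items']:
            yield item

        page += 1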
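
The same pagination pattern can be exercised without a network by injecting the page-fetching function. In this sketch, paginate, fake_fetch, and FAKE_PAGES are hypothetical names standing in for a real API client:

```python
def paginate(fetch_page):
    """Yield items page by page until an empty page is returned."""
    page = 1
    while True:
        items = fetch_page(page)
        if not items:
            break
        yield from items
        page += 1

# Stand-in for a real API: two pages of data, then nothing.
FAKE_PAGES = {1: ["a", "b"], 2: ["c"]}
calls = []

def fake_fetch(page):
    calls.append(page)
    return FAKE_PAGES.get(page, [])

gen = paginate(fake_fetch)
first = next(gen)
calls_after_first = list(calls)  # only the first page has been requested so far
rest = list(gen)
```

Pages are requested only as the consumer asks for more items, so stopping early also avoids the remaining requests.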

Generator Expressions vs. Generator Functions

Generator expressions are created using parentheses and are suitable for simple cases:

squares = (x*x for x in range(10))

Generator functions use the yield keyword and are suitable for complex logic:

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
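
Since fibonacci() never ends, it has to be consumed with a limit. A short sketch using itertools.islice and itertools.takewhile, alongside a generator expression passed straight to sum():

```python
from itertools import islice, takewhile

def fibonacci():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b

# First ten Fibonacci numbers.
first_ten = list(islice(fibonacci(), 10))

# All Fibonacci numbers below 100.
below_100 = list(takewhile(lambda n: n < 100, fibonacci()))

# A generator expression can be passed directly to a consuming function.
sum_of_squares = sum(x * x for x in range(10))

print(first_ten, below_100, sum_of_squares)
```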

Performance and Optimization

Generators are especially effective in the following scenarios:

  • Processing large files (logs, CSV, JSON)
  • Working with streaming APIs
  • Mathematical calculations of sequences
  • Data processing pipelines
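
As an illustration of the pipeline scenario, generators can be chained so that each item flows through every stage before the next one is read. The stage names and sample input below are assumptions for this sketch:

```python
def parse_numbers(lines):
    # Skip blank lines and convert the rest to integers.
    for line in lines:
        line = line.strip()
        if line:
            yield int(line)

def only_even(numbers):
    for n in numbers:
        if n % 2 == 0:
            yield n

def squared(numbers):
    for n in numbers:
        yield n * n

raw = ["1", "2", "", " 4 ", "5", "6"]

# No stage runs until the pipeline is consumed.
pipeline = squared(only_even(parse_numbers(raw)))
result = list(pipeline)
print(result)
```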

Comparison of return and yield

Characteristic        return                 yield
Returns               One value              Iterator
Completes function    Yes                    No
Saves state           No                     Yes
Memory consumption    Depends on the data    Minimal
Reusability           Requires a new call    Continues from stop point
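
The two keywords can also appear together. In the sketch below (function names are illustrative), a return inside a generator ends the iteration, and its value is attached to the StopIteration exception rather than delivered to the loop:

```python
def squares_with_return(n):
    # Builds and returns the whole list at once.
    return [i * i for i in range(n)]

def squares_with_yield(n):
    # Produces values one at a time.
    for i in range(n):
        yield i * i
    return "done"  # becomes StopIteration.value

as_list = squares_with_return(3)

gen = squares_with_yield(3)
values = [next(gen), next(gen), next(gen)]

final = None
try:
    next(gen)
except StopIteration as stop:
    final = stop.value
```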

Common Mistakes When Working with Generators

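  • Reuse: A generator can be iterated only once; after it is exhausted, a new one has to be created
  • Forgotten next(): Calling the generator function only creates the generator; its body does not run until next() is called or the generator is iterated
  • Infinite Loops: Consuming an infinite generator without a stop condition (for example, calling list() on it) never finishes
  • Incorrect Exception Handling: Calling next() on an exhausted generator raises StopIteration, which must be caught or avoided with a default, as in next(gen, None)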
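
The first two mistakes are easy to reproduce. A small sketch, with names chosen only for illustration:

```python
# Reuse: the second pass over the same generator yields nothing.
gen = (n * 2 for n in range(3))
first_pass = list(gen)
second_pass = list(gen)

# Fix: create a fresh generator for every pass.
def doubled():
    for n in range(3):
        yield n * 2

fresh_pass = list(doubled())

# Forgotten next(): the body does not run when the function is called.
events = []

def noisy():
    events.append("started")
    yield 1

pending = noisy()
ran_before_next = bool(events)
next(pending)
ran_after_next = bool(events)
```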

Integration with Popular Libraries

Generators integrate well with data analysis libraries:

import pandas as pd

# read_large_dataset, preprocess and analyze are placeholders for your own functions
def data_processor():
    for chunk in read_large_dataset():
        yield preprocess(chunk)

# Using with pandas: each chunk becomes its own DataFrame
for chunk in data_processor():
    df = pd.DataFrame(chunk)
    analyze(df)
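
One way such chunks might be produced is a batching generator built on the standard library. The in_batches helper and the sample rows below are assumptions for this sketch; each batch is a list of dicts, a shape that pd.DataFrame accepts:

```python
from itertools import islice

def in_batches(iterable, size):
    """Yield lists of up to `size` items from any iterable."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

# Rows are generated lazily and grouped two at a time.
rows = ({"id": i, "value": i * 10} for i in range(5))
batches = list(in_batches(rows, 2))
batch_sizes = [len(batch) for batch in batches]
print(batch_sizes)
```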

Conclusion

Generators and the yield keyword are fundamental tools for memory-efficient programming in Python. They let you process data with minimal memory consumption, especially when working with large datasets or streaming sources.

Understanding how generators work opens up ways to optimize code and build cleaner solutions in data processing, web development, and scientific computing.
