How to speed up code in Python


Introduction to Python Optimization

Python has gained popularity thanks to its simple syntax and rich ecosystem of libraries. However, performance issues arise when working with large amounts of data, complex mathematical calculations, or high-load applications. This guide covers proven ways to speed up Python code, from basic techniques to advanced solutions.

Main Reasons for Slow Python Performance

Architectural Features of the Language

  • Interpreted Nature of the Language: Python is executed line by line by an interpreter, which adds overhead compared to compiled languages.
  • Dynamic Typing: Type checking occurs during runtime, requiring additional computational resources.
  • Global Interpreter Lock (GIL): A locking mechanism that limits true multithreading in CPython.

Non-Optimal Algorithms

A poorly chosen algorithm is often the main cause of slow performance; it can usually be addressed by selecting the right data structures and algorithms for the task.
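As an illustration (with hypothetical function names), removing duplicates from a list by repeatedly scanning the result is O(n²), while tracking seen items in a set brings it down to O(n):

```python
def dedup_quadratic(items):
    # O(n^2): each membership test scans the result list
    result = []
    for item in items:
        if item not in result:
            result.append(item)
    return result

def dedup_linear(items):
    # O(n): set membership is O(1) on average
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result
```

Both return the same values in the same order; only the cost of the membership test differs.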

Practical Methods for Python Optimization

1. Code Performance Analysis

Before optimization, it is necessary to identify bottlenecks:

import cProfile
import pstats

# Function profiling
cProfile.run('your_function()', 'profile_stats')
p = pstats.Stats('profile_stats')
p.sort_stats('cumulative').print_stats(10)

Tools for analysis:

  • cProfile - built-in profiler
  • line_profiler - line-by-line analysis
  • memory_profiler - memory usage monitoring

2. Effective Use of Built-in Functions

Python built-in functions are implemented in C and run much faster than user code:

# Slow option
total = 0
for i in range(1000000):
    total += i

# Fast option
total = sum(range(1000000))

# Find the maximum element
# Slow
max_val = numbers[0]
for num in numbers[1:]:
    if num > max_val:
        max_val = num

# Fast
max_val = max(numbers)

3. Optimizing Data Handling

  • Using Generators to Save Memory:
# Creating a list (a lot of memory)
squares_list = [x**2 for x in range(1000000)]

# Generator (memory saving)
squares_gen = (x**2 for x in range(1000000))

# Generator function
def fibonacci_generator():
    a, b = 0, 1
    while True:
        yield a
        a, b = b, a + b
  • Choosing the Right Data Structures:
# Use set to search for items
large_list = list(range(100000))
large_set = set(range(100000))

# Search in a list: O(n)
if 99999 in large_list:
    pass

# Search in a set: O(1) on average
if 99999 in large_set:
    pass

# Use Counter to count items
from collections import Counter
data = ['a', 'b', 'a', 'c', 'b', 'a']
counter = Counter(data)

4. Vectorization with NumPy

NumPy provides high-performance array operations:

import numpy as np

# Regular Python
def python_sum_squares(arr):
    return sum(x**2 for x in arr)

# NumPy Vectorization
def numpy_sum_squares(arr):
    np_arr = np.array(arr)
    return np.sum(np_arr**2)

# Example Use
data = list(range(1000000))
# The NumPy version is typically 10-50x faster

5. Optimizing Loops

Avoid unnecessary operations inside loops:

# Inefficient
items = ['apple', 'banana', 'cherry']
for i in range(len(items)):
    print(f"Item {i}: {items[i]}")

# Efficient
for i, item in enumerate(items):
    print(f"Item {i}: {item}")

# Hoist loop-invariant calculations
# Inefficient
for item in items:
    result = expensive_function()  # Called on every iteration
    process(item, result)

# Efficient
result = expensive_function()  # Called once
for item in items:
    process(item, result)

6. Multiprocessing and Multithreading

  • Using multiprocessing for CPU-intensive tasks:
from multiprocessing import Pool, cpu_count
import time

def cpu_intensive_task(n):
    return sum(i*i for i in range(n))

# Sequential execution
def sequential_processing(tasks):
    return [cpu_intensive_task(task) for task in tasks]

# Parallel execution
def parallel_processing(tasks):
    with Pool(processes=cpu_count()) as pool:
        return pool.map(cpu_intensive_task, tasks)

# Example use (the __main__ guard is required on platforms that
# spawn worker processes, such as Windows and macOS)
if __name__ == "__main__":
    tasks = [100000] * 8
    start = time.time()
    sequential_result = sequential_processing(tasks)
    sequential_time = time.time() - start

    start = time.time()
    parallel_result = parallel_processing(tasks)
    parallel_time = time.time() - start
  • Threading for I/O operations:
import threading
import requests
from concurrent.futures import ThreadPoolExecutor

def fetch_url(url):
    response = requests.get(url)
    return response.status_code

urls = ['http://example.com'] * 10

# Sequentially
results = [fetch_url(url) for url in urls]

# In parallel
with ThreadPoolExecutor(max_workers=5) as executor:
    results = list(executor.map(fetch_url, urls))

7. Calculation Caching

Using functools.lru_cache:

from functools import lru_cache
import requests

@lru_cache(maxsize=None)
def fibonacci(n):
    if n < 2:
        return n
    return fibonacci(n-1) + fibonacci(n-2)

# Caching HTTP request results
@lru_cache(maxsize=128)
def get_api_data(endpoint):
    response = requests.get(endpoint)
    return response.json()

8. Asynchronous Programming

For I/O-intensive applications, use asyncio:

import asyncio
import aiohttp

async def fetch_data(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = ['http://example.com'] * 10
    
    async with aiohttp.ClientSession() as session:
        tasks = [fetch_data(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
    
    return results

# Launch an asynchronous function
results = asyncio.run(main())

9. Compilation and Alternative Interpreters

  • PyPy for automatic acceleration:

PyPy can often speed up pure-Python code by 2-10x without any changes:

# PyPy is a separate interpreter, not a pip package:
# download it from pypy.org or install it via your
# system package manager or pyenv

# Run the script
pypy3 your_script.py
  • Numba for JIT compilation:
from numba import njit

# Wrapping np.dot in a JIT would gain little, since NumPy is
# already implemented in C; Numba shines on explicit Python loops
@njit
def sum_of_squares(arr):
    total = 0.0
    for x in arr:
        total += x * x
    return total

# The first call compiles the function;
# subsequent calls run at near-C speed

10. Memory Optimization

Using slots to save memory:

class RegularClass:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class OptimizedClass:
    __slots__ = ['x', 'y']
    
    def __init__(self, x, y):
        self.x = x
        self.y = y

# OptimizedClass instances typically use noticeably less memory
# (often 40-50% for small objects; exact savings vary by Python version)
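To check the savings yourself, one rough approach (a sketch; exact numbers vary by Python version) is to compare per-instance sizes, remembering that a regular instance also carries a separate __dict__:

```python
import sys

class RegularPoint:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class SlottedPoint:
    __slots__ = ('x', 'y')

    def __init__(self, x, y):
        self.x = x
        self.y = y

regular = RegularPoint(1, 2)
slotted = SlottedPoint(1, 2)

# A regular instance carries a per-instance __dict__; a slotted one does not
regular_size = sys.getsizeof(regular) + sys.getsizeof(regular.__dict__)
slotted_size = sys.getsizeof(slotted)

print(f"regular: {regular_size} bytes, slotted: {slotted_size} bytes")
```

Note that sys.getsizeof only gives a rough lower bound; for whole-program measurements, tracemalloc from the standard library is more thorough.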

Practical Examples of Optimization

Example 1: Processing Large Files

# Inefficient: loads the entire file into memory
def process_file_bad(filename):
    with open(filename, 'r') as f:
        lines = f.readlines()
    
    processed = []
    for line in lines:
        processed.append(line.strip().upper())
    
    return processed

# Efficient: processes the file line by line (returns a generator)
def process_file_good(filename):
    with open(filename, 'r') as f:
        for line in f:
            yield line.strip().upper()

Example 2: Filtering and Transforming Data

# Slow
def process_numbers_slow(numbers):
    result = []
    for num in numbers:
        if num % 2 == 0:
            result.append(num ** 2)
    return result

# Fast
def process_numbers_fast(numbers):
    return [num ** 2 for num in numbers if num % 2 == 0]

# Even faster with NumPy
import numpy as np

def process_numbers_numpy(numbers):
    arr = np.array(numbers)
    even_mask = arr % 2 == 0
    return arr[even_mask] ** 2

Example 3: Working with API

import asyncio
import aiohttp
import requests

# Synchronous option
def fetch_urls_sync(urls):
    results = []
    for url in urls:
        response = requests.get(url)
        results.append(response.json())
    return results

# Asynchronous option
async def fetch_urls_async(urls):
    async with aiohttp.ClientSession() as session:
        tasks = []
        for url in urls:
            task = asyncio.create_task(fetch_url(session, url))
            tasks.append(task)
        
        results = await asyncio.gather(*tasks)
        return results

async def fetch_url(session, url):
    async with session.get(url) as response:
        return await response.json()

Performance Monitoring Tools

Code Profiling

import cProfile
import pstats
import time

def profile_function(func, *args, **kwargs):
    pr = cProfile.Profile()
    pr.enable()
    
    result = func(*args, **kwargs)
    
    pr.disable()
    
    stats = pstats.Stats(pr)
    stats.sort_stats('cumulative')
    stats.print_stats(10)
    
    return result

Measuring Execution Time

import time
from functools import wraps

def timing_decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f"{func.__name__} took {end - start:.4f} seconds")
        return result
    return wrapper

@timing_decorator
def slow_function():
    time.sleep(1)
    return "Done"

Recommendations for Choosing an Optimization Approach

For Numerical Calculations

  • NumPy - for arrays and matrix operations
  • Numba - for JIT compilation of mathematical functions
  • Cython - for critical sections of code

For Working with Data

  • Pandas - for analyzing and processing structured data
  • Dask - for parallel computations with large data
  • Polars - a fast alternative to Pandas

For Network Applications

  • asyncio - for asynchronous programming
  • aiohttp - for HTTP clients and servers
  • uvloop - a fast event loop for asyncio

For Parallel Calculations

  • multiprocessing - for CPU-intensive tasks
  • threading - for I/O-intensive tasks
  • concurrent.futures - a convenient interface for parallelism
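As a minimal sketch of the concurrent.futures interface (the square function is a hypothetical example), the same map-style API covers both threads and processes:

```python
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor

def square(n):
    return n * n

# Thread pool: suited to I/O-bound work
with ThreadPoolExecutor(max_workers=4) as executor:
    thread_results = list(executor.map(square, range(5)))

# Process pool: suited to CPU-bound work (the __main__ guard is
# required on platforms that spawn worker processes)
if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=2) as executor:
        process_results = list(executor.map(square, range(5)))
```

Switching between the two executors is often a one-line change, which makes it easy to test which model fits a given workload.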

Common Optimization Mistakes

Premature Optimization

Optimize only after identifying real bottlenecks through profiling.

Ignoring the Complexity of Algorithms

No optimization will help if an inefficient algorithm O(n²) is used instead of O(n log n).
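For example (hypothetical functions), checking whether any pair in a list sums to a target value can be done by comparing every pair, or by sorting first and scanning with two pointers:

```python
def has_pair_quadratic(numbers, target):
    # O(n^2): compare every pair of elements
    n = len(numbers)
    for i in range(n):
        for j in range(i + 1, n):
            if numbers[i] + numbers[j] == target:
                return True
    return False

def has_pair_sorted(numbers, target):
    # O(n log n): sort, then close in from both ends
    numbers = sorted(numbers)
    lo, hi = 0, len(numbers) - 1
    while lo < hi:
        s = numbers[lo] + numbers[hi]
        if s == target:
            return True
        if s < target:
            lo += 1
        else:
            hi -= 1
    return False
```

On a million elements the quadratic version performs up to ~10¹² comparisons, and no micro-optimization of the inner loop can compensate for that.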

Optimizing Non-Critical Sections

Focus on the code that is executed most often or takes the most time.

Sacrificing Readability for Speed

Code should remain understandable and maintainable.

Conclusion

Optimizing Python code requires a comprehensive approach: from choosing the right algorithms to using specialized libraries. Start with profiling to identify bottlenecks, use built-in functions and libraries, consider vectorization with NumPy for numerical tasks, and asynchrony for I/O operations.

Remember that the best optimization is the correct algorithm. Only then apply technical acceleration techniques. Regularly measure performance and do not forget about the balance between execution speed and code readability.
