What is parallel and asynchronous programming in Python
Parallel and asynchronous programming in Python are powerful tools for optimizing application performance. These approaches make it possible to effectively manage the execution of tasks using various strategies for working with computing resources.
Parallel programming in Python
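Parallel programming executes several tasks simultaneously using multiple processes or threads. It is especially effective for CPU-intensive operations. In standard CPython builds, the global interpreter lock (GIL) generally prevents threads from running Python bytecode in parallel, so CPU-bound work usually calls for processes rather than threads.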
The multiprocessing module: creating and managing processes
The multiprocessing module provides the ability to create independent processes that can run in parallel on different processor cores.
import multiprocessing
import os

def worker():
    print("Process:", os.getpid())

if __name__ == "__main__":
    # Creating a process
    process = multiprocessing.Process(target=worker)
    # Starting the process
    process.start()
    # Waiting for the process to complete
    process.join()
Key methods:
- process.start(): starts the child process and arranges for the Process object's run() method to be invoked in it
- process.join(): blocks the calling thread until the child process has finished, which keeps the two in sync
Process pool for bulk computations
import multiprocessing
import time

def compute_square(n):
    time.sleep(0.1)  # Simulation of calculations
    return n * n

if __name__ == "__main__":
    numbers = [1, 2, 3, 4, 5]
    # Pool() starts one worker process per available CPU by default
    with multiprocessing.Pool() as pool:
        results = pool.map(compute_square, numbers)
    print(f"Results: {results}")
The concurrent.futures module: a high-level interface
The concurrent.futures module provides a unified interface for working with threads and processes.
import concurrent.futures
import time

def worker(name):
    time.sleep(1)
    return f"Execution in the {name} thread"

if __name__ == "__main__":
    # Creating a thread pool
    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
        # Running a function in a separate thread
        future = executor.submit(worker, "Thread-1")
        print(future.result())  # Getting the result of execution
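The "unified interface" mentioned above can be illustrated with a small sketch: the same helper runs a function through either executor class, since ThreadPoolExecutor and ProcessPoolExecutor both implement the Executor interface. The names run_with and square are illustrative assumptions, not part of the original example.

```python
import concurrent.futures


def square(n):
    # Defined at module level so it can be pickled for ProcessPoolExecutor
    return n * n


def run_with(executor_cls, func, items, max_workers=2):
    # executor_cls is assumed to be a concurrent.futures.Executor subclass;
    # map() yields results in the same order as the inputs
    with executor_cls(max_workers=max_workers) as executor:
        return list(executor.map(func, items))


if __name__ == "__main__":
    numbers = [1, 2, 3, 4]
    # Only the executor class changes between the two calls
    print(run_with(concurrent.futures.ThreadPoolExecutor, square, numbers))
    print(run_with(concurrent.futures.ProcessPoolExecutor, square, numbers))
```

Switching between threads and processes then comes down to changing one argument, which makes it easy to try both for a given workload.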
The threading module: multithreading
The threading module is well suited to I/O-bound operations and to tasks that need to run concurrently within a single process.
import threading
import time

def worker(name):
    print(f'Starting {name}')
    time.sleep(2)
    print(f'Finished {name}')

# Creating and starting threads
threads = []
for i in range(3):
    t = threading.Thread(target=worker, args=(f'Thread-{i}',))
    threads.append(t)
    t.start()

# Waiting for all threads to finish
for t in threads:
    t.join()
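Because threads run within a single process, they share its memory, so concurrent updates to the same object need synchronization. The following is a minimal sketch of that idea using threading.Lock; the helper names add_many and run_counter and the dictionary-based counter are assumptions made for illustration.

```python
import threading


def add_many(counter, lock, times):
    for _ in range(times):
        # The lock makes the read-modify-write step safe across threads
        with lock:
            counter["value"] += 1


def run_counter(num_threads=4, times=10_000):
    counter = {"value": 0}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=add_many, args=(counter, lock, times))
        for _ in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]


if __name__ == "__main__":
    # With the lock in place the total equals num_threads * times
    print(run_counter())
```

Without the lock, two threads could read the same old value and one increment could be lost, which is the kind of race condition that makes threaded code harder to debug.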
Asynchronous programming in Python
Asynchronous programming allows you to perform tasks in a non-blocking mode, which is especially effective for I/O operations and network interaction.
Basics of asyncio
The asyncio module provides an infrastructure for writing asynchronous code using coroutines.
import asyncio

async def worker(name, delay):
    print(f'Starting {name}')
    await asyncio.sleep(delay)  # Non-blocking pause
    print(f'Finished {name}')
    return f'Result from {name}'

async def main():
    # Concurrent execution of asynchronous tasks
    tasks = [
        worker("Task-1", 2),
        worker("Task-2", 1),
        worker("Task-3", 3)
    ]
    # gather() returns results in the order the coroutines were passed
    results = await asyncio.gather(*tasks)
    print(f"All results: {results}")

if __name__ == "__main__":
    asyncio.run(main())
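One way to see the effect of non-blocking execution is to time it. In the sketch below, several pauses are awaited together with asyncio.gather, and the elapsed time should be close to the longest pause rather than the sum of all of them. The names pause and timed_gather are illustrative assumptions.

```python
import asyncio
import time


async def pause(delay):
    # asyncio.sleep yields control to the event loop while waiting
    await asyncio.sleep(delay)
    return delay


async def timed_gather(delays):
    start = time.perf_counter()
    results = await asyncio.gather(*(pause(d) for d in delays))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(timed_gather([0.3, 0.1, 0.2]))
    print(f"Results: {results}, elapsed: {elapsed:.2f} s")
```

Replacing asyncio.sleep with the blocking time.sleep would stall the event loop, and the pauses would then run one after another.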
Asynchronous network requests: aiohttp
The aiohttp library provides an asynchronous HTTP client built on asyncio, which allows many requests to be in flight at the same time.
import aiohttp
import asyncio
import time

async def fetch_data(session, url):
    try:
        async with session.get(url) as response:
            return await response.text()
    except Exception as e:
        # Return the error as text so one failure does not cancel the batch
        return f"Error: {e}"

async def main():
    urls = [
        "https://httpbin.org/delay/1",
        "https://httpbin.org/delay/2",
        "https://httpbin.org/delay/1"
    ]
    # A single session is shared by all requests
    async with aiohttp.ClientSession() as session:
        start_time = time.perf_counter()
        results = await asyncio.gather(*[fetch_data(session, url) for url in urls])
        end_time = time.perf_counter()
    print(f"Completed in {end_time - start_time:.2f} seconds")
    print(f"Received {len(results)} responses")

if __name__ == "__main__":
    asyncio.run(main())
Working with asynchronous generators
import asyncio

async def async_generator():
    for i in range(5):
        await asyncio.sleep(0.5)
        yield i

async def main():
    # async for awaits each value as the generator produces it
    async for value in async_generator():
        print(f"Value received: {value}")

if __name__ == "__main__":
    asyncio.run(main())
Comparing approaches: when to use each
| Parameter | Parallel programming | Asynchronous programming |
|---|---|---|
| Execution model | Multithreading or multiprocessing | Single thread with an event loop |
| Main goal | Increased computing performance | Increased responsiveness and efficiency of I/O |
| CPU usage | High; load is distributed across cores | Low; waiting time is used efficiently |
| Typical tasks | Mathematical calculations, data processing | Network requests, working with files, databases |
| Debugging difficulty | High, due to race conditions | Moderate, more predictable behavior |
| Memory | High consumption (separate processes) | Low consumption (single process) |
| Basic tools | threading, multiprocessing, concurrent.futures | asyncio, aiohttp, aiofiles |
Practical recommendations
Use parallel programming when:
- The work involves CPU-intensive calculations
- Large amounts of data need to be processed
- The task can be divided into independent parts
- A multi-core system is available
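As a hedged illustration of dividing a task into independent parts, the sketch below splits a sum of squares into contiguous ranges and hands each range to a separate process. The helper names split_range, partial_sum and parallel_sum_of_squares, and the default of four parts, are assumptions made for this example.

```python
import concurrent.futures


def split_range(n, parts):
    # Split range(n) into `parts` contiguous (start, stop) pairs
    step, extra = divmod(n, parts)
    bounds, start = [], 0
    for i in range(parts):
        stop = start + step + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def partial_sum(bounds):
    # Each part depends only on its own bounds, so parts can run in any order
    start, stop = bounds
    return sum(i * i for i in range(start, stop))


def parallel_sum_of_squares(n, parts=4):
    with concurrent.futures.ProcessPoolExecutor(max_workers=parts) as executor:
        return sum(executor.map(partial_sum, split_range(n, parts)))


if __name__ == "__main__":
    print(parallel_sum_of_squares(1_000_000))
```

For small inputs the cost of starting processes and transferring data can outweigh the gain, so it is worth measuring before committing to this structure.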
Use asynchronous programming when:
- The work consists mainly of network requests
- The application performs many input/output operations
- The application must remain highly responsive
- Many simultaneous connections must be handled
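When many connections are handled at once, it is often useful to cap how many are active at the same time. The following is a minimal sketch using asyncio.Semaphore, with asyncio.sleep standing in for real network I/O; the names handle and serve_all and the stats dictionary are assumptions made for illustration.

```python
import asyncio


async def handle(conn_id, semaphore, stats):
    # At most `limit` coroutines can be inside this block at once
    async with semaphore:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
        await asyncio.sleep(0.01)  # Stand-in for real network I/O
        stats["active"] -= 1
    return conn_id


async def serve_all(count, limit):
    semaphore = asyncio.Semaphore(limit)
    stats = {"active": 0, "peak": 0}
    results = await asyncio.gather(
        *(handle(i, semaphore, stats) for i in range(count))
    )
    return results, stats["peak"]


if __name__ == "__main__":
    results, peak = asyncio.run(serve_all(count=20, limit=5))
    print(f"Handled {len(results)} connections, peak concurrency {peak}")
```

The counters need no lock here because all coroutines run in one thread and only switch at await points.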
Hybrid approach
Modern applications often combine both approaches:
import asyncio
import concurrent.futures

def cpu_intensive_task(n):
    # Simulating a CPU-intensive task
    result = sum(i * i for i in range(n))
    return result

async def main():
    # get_running_loop() is the preferred call inside a coroutine
    loop = asyncio.get_running_loop()
    # Performing CPU-intensive tasks in separate processes
    with concurrent.futures.ProcessPoolExecutor() as executor:
        tasks = [
            loop.run_in_executor(executor, cpu_intensive_task, 100000),
            loop.run_in_executor(executor, cpu_intensive_task, 200000),
            loop.run_in_executor(executor, cpu_intensive_task, 150000)
        ]
        # The event loop stays free while the worker processes compute
        results = await asyncio.gather(*tasks)
    print(f"Results: {results}")

if __name__ == "__main__":
    asyncio.run(main())
The right choice between parallel and asynchronous programming depends on the nature of the tasks and the performance requirements of your application. Understanding the strengths and limits of each approach will help you create more efficient and scalable solutions.