What is the Requests Library and Why Do You Need It?
The requests library is one of the most popular and easy-to-use Python libraries for making HTTP requests. It provides a convenient interface for sending GET, POST, PUT, DELETE, and other types of HTTP requests with a minimal amount of code.
Key Advantages of the Requests Library
- Simple syntax and intuitive API
- Built-in session support and cookie management
- Automatic JSON and form handling
- Support for various authentication methods
- Ability to work with proxy servers
- Comprehensive documentation in English
- Active support from the developer community
Installing the Requests Library
Installing the requests library is done via the pip package manager. The installation process depends on the version of Python you are using.
Standard Installation
pip install requests
Installation for Python 3
If multiple versions of Python are installed on your system, use the pip3 command to install it into the Python 3 environment:
pip3 install requests
Verifying the Installation
After installation, you can verify the success of the procedure by importing the library into the Python console:
import requests
print(requests.__version__)
Basics of Working with GET Requests
GET requests are used to retrieve data from a web server. This is the most common type of HTTP request in web development.
Simple GET Request
import requests
response = requests.get('https://api.github.com')
print(response.status_code) # Response status (200 - OK)
print(response.text) # Response body as a string
Analyzing the Server Response
When working with GET requests, it is important to properly analyze the received response. The response object contains many useful attributes:
- status_code - HTTP status of the response
- text - response content in text format
- content - response content in bytes
- headers - response headers
- url - the final URL of the request
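To illustrate these attributes without making a network call, a Response object can be assembled by hand. This is a sketch only: the `_content` attribute is private and is normally filled in by requests itself after a real request, and the URL is a placeholder.

```python
import requests

# Build a Response manually purely for illustration; in real code
# requests constructs and populates this object for you.
response = requests.Response()
response.status_code = 200
response._content = b'{"ok": true}'   # private attribute, set here only for the demo
response.encoding = 'utf-8'
response.headers['Content-Type'] = 'application/json'
response.url = 'https://api.example.com/data'

print(response.status_code)              # 200
print(response.text)                     # {"ok": true}
print(response.content)                  # b'{"ok": true}'
print(response.headers['content-type'])  # header lookup is case-insensitive
print(response.url)
```

Note that `response.headers` is a case-insensitive dictionary, so `'Content-Type'` and `'content-type'` refer to the same header.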
Working with Request Parameters
URL parameters can be passed through a dictionary, making the code more readable and easier to modify:
params = {
'q': 'python programming',
'page': 2,
'limit': 50
}
response = requests.get('https://www.example.com/search', params=params)
print(response.url) # Show the formed URL with parameters
POST Requests and Sending Data to the Server
POST requests are used to send data to the server, for example, when filling out forms or creating new records in the database.
Sending Form Data
data = {
'username': 'admin',
'password': '12345',
'email': 'admin@example.com'
}
response = requests.post('https://httpbin.org/post', data=data)
print(response.text)
Sending JSON Data
When working with modern APIs, it is often necessary to send data in JSON format. The requests library automatically sets the correct Content-Type when using the json parameter:
import json
data = {
'name': 'John Doe',
'age': 30,
'city': 'Moscow'
}
response = requests.post('https://httpbin.org/post', json=data)
result = response.json() # Automatic conversion of the response to JSON
print(result)
Differences Between the Data and JSON Parameters
- The data parameter sends data as a form (application/x-www-form-urlencoded)
- The json parameter sends data in JSON format (application/json)
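The difference can be inspected without sending anything over the network by preparing the same payload both ways and looking at the encoded body and Content-Type (httpbin.org is used here only as a placeholder URL):

```python
import requests

payload = {'name': 'John', 'age': 30}

# Prepare (but do not send) the same payload with data= and with json=
form_req = requests.Request('POST', 'https://httpbin.org/post', data=payload).prepare()
json_req = requests.Request('POST', 'https://httpbin.org/post', json=payload).prepare()

print(form_req.headers['Content-Type'])  # application/x-www-form-urlencoded
print(form_req.body)                     # name=John&age=30
print(json_req.headers['Content-Type'])  # application/json
print(json_req.body)                     # JSON-encoded payload (bytes in recent versions)
```

Preparing a request this way is also a handy debugging technique when an API rejects your payload and you want to see exactly what would have been sent.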
Working with HTTP Headers
HTTP headers contain metadata about the request or response. Proper header configuration is critical when working with APIs and web scraping.
Setting Custom Headers
headers = {
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
'Accept': 'application/json',
'Content-Type': 'application/json'
}
response = requests.get('https://httpbin.org/headers', headers=headers)
print(response.json())
Important Headers for Web Scraping
- User-Agent - client identification (browser, bot)
- Accept - content types that the client can process
- Accept-Language - preferred languages
- Referer - URL of the page from which the user navigated
Session Management and Cookies
Sessions allow you to maintain state between multiple HTTP requests. This is especially important when working with authorization and sites that require authentication.
Creating and Using a Session
session = requests.Session()
# Set cookies via the first request
session.get('https://httpbin.org/cookies/set/sessioncookie/123456789')
# Cookies are automatically sent in subsequent requests
response = session.get('https://httpbin.org/cookies')
print(response.text)
Advantages of Using Sessions
- Automatic cookie management
- Reuse of TCP connections to improve performance
- Ability to set common headers for all session requests
- Saving authorization parameters
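The "common headers" point can be seen without any network traffic by preparing a request through the session; `prepare_request` shows how session-level headers merge with per-request ones. The URL and the `X-Trace` header here are made-up placeholders:

```python
import requests

session = requests.Session()
# Headers set on the session are merged into every request it makes
session.headers.update({'User-Agent': 'my-app/1.0', 'Accept': 'application/json'})

# prepare_request performs the merge without actually sending anything
req = requests.Request('GET', 'https://api.example.com/data',
                       headers={'X-Trace': 'abc'})
prepared = session.prepare_request(req)

print(prepared.headers['User-Agent'])  # my-app/1.0 (from the session)
print(prepared.headers['X-Trace'])     # abc (from the individual request)
```

Per-request headers take precedence over session headers when the same name appears in both.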
Downloading Files from the Internet
The requests library makes it easy to download files of various formats from web servers.
Downloading Images and Documents
url = 'https://example.com/document.pdf'
response = requests.get(url)
with open('document.pdf', 'wb') as file:
file.write(response.content)
Downloading Large Files with Streaming
For large files, it is recommended to use streaming to save memory:
url = 'https://example.com/large-file.zip'
response = requests.get(url, stream=True)
with open('large-file.zip', 'wb') as file:
for chunk in response.iter_content(chunk_size=8192):
file.write(chunk)
Error and Exception Handling
Proper error handling is critical to building reliable network applications.
Main Types of Exceptions
try:
response = requests.get('https://api.github.com/invalid-url', timeout=5)
response.raise_for_status() # Raises an exception for HTTP errors
except requests.exceptions.HTTPError as err:
print(f"HTTP error: {err}")
except requests.exceptions.ConnectionError:
print("Error connecting to the server")
except requests.exceptions.Timeout:
print("Request timeout exceeded")
except requests.exceptions.RequestException as e:
print(f"An unexpected error occurred: {e}")
Checking Response Statuses
response = requests.get('https://api.example.com/data')
if response.status_code == 200:
print("Request was successful")
elif response.status_code == 404:
print("Resource not found")
elif response.status_code == 500:
print("Internal server error")
else:
print(f"Received status: {response.status_code}")
Setting Request Timeouts
Setting timeouts prevents the program from hanging when servers are slow or unavailable.
Types of Timeouts
# Separate connect and read timeouts: (connect, read)
response = requests.get('https://httpbin.org/delay/10', timeout=(5, 30))
# A single value applies to both the connect and read phases (it is not a total wall-clock limit)
try:
response = requests.get('https://httpbin.org/delay/5', timeout=2)
print(response.text)
except requests.exceptions.Timeout:
print("Timeout exceeded")
Recommendations for Setting Timeouts
- Connection timeout: 3-5 seconds
- Read timeout: 15-30 seconds
- For APIs: 10-15 seconds
- For downloading files: 60-120 seconds
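requests has no session-wide timeout setting, so one way to apply these recommendations everywhere is a small Session subclass. This is an illustrative sketch, not part of the requests API:

```python
import requests

class TimeoutSession(requests.Session):
    """Session that applies a default timeout unless the caller passes one.

    Illustrative helper: requests itself has no session-wide timeout option.
    """

    def __init__(self, timeout=(5, 30)):  # (connect, read), per the guidelines above
        super().__init__()
        self.default_timeout = timeout

    def request(self, method, url, **kwargs):
        # Fill in the default only when the caller did not specify a timeout
        kwargs.setdefault('timeout', self.default_timeout)
        return super().request(method, url, **kwargs)

session = TimeoutSession(timeout=(5, 15))
# session.get(...) now uses timeout=(5, 15) unless overridden per call
```

Overriding `request` covers every HTTP method, since `get`, `post`, and the rest all funnel through it.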
Working with Proxy Servers
Proxy servers are used to bypass geographical restrictions, ensure anonymity, or work through corporate networks.
Setting HTTP and HTTPS Proxies
proxies = {
'http': 'http://proxy-server.com:8080',
'https': 'https://proxy-server.com:8080'
}
response = requests.get('https://httpbin.org/ip', proxies=proxies)
print(response.text)
Proxies with Authentication
proxies = {
'http': 'http://username:password@proxy-server.com:8080',
'https': 'https://username:password@proxy-server.com:8080'
}
response = requests.get('https://example.com', proxies=proxies)
HTTP Authentication Methods
Modern web services use various authentication methods to protect data and control access.
HTTP Basic Authentication
from requests.auth import HTTPBasicAuth
response = requests.get(
'https://httpbin.org/basic-auth/user/pass',
auth=HTTPBasicAuth('user', 'pass')
)
print(response.status_code)
# Alternative method
response = requests.get(
'https://httpbin.org/basic-auth/user/pass',
auth=('user', 'pass')
)
Bearer Token Authentication
headers = {
'Authorization': 'Bearer YOUR_API_TOKEN_HERE',
'Content-Type': 'application/json'
}
response = requests.get('https://api.example.com/data', headers=headers)
print(response.json())
API Key Authentication
# In headers
headers = {'X-API-Key': 'your-api-key-here'}
response = requests.get('https://api.example.com/data', headers=headers)
# In URL parameters
params = {'api_key': 'your-api-key-here'}
response = requests.get('https://api.example.com/data', params=params)
Additional Requests Features
Sending Files to the Server
# Sending one file; a with statement ensures the file is closed afterwards
with open('document.txt', 'rb') as f:
    files = {'file': f}
    response = requests.post('https://httpbin.org/post', files=files)
# Sending multiple files
with open('document1.txt', 'rb') as f1, open('document2.txt', 'rb') as f2:
    files = {'file1': f1, 'file2': f2}
    response = requests.post('https://httpbin.org/post', files=files)
Configuring Retries
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
session = requests.Session()
retry_strategy = Retry(
total=3, # Total number of attempts
backoff_factor=1, # Delay between attempts
status_forcelist=[429, 500, 502, 503, 504], # Statuses for retry
)
adapter = HTTPAdapter(max_retries=retry_strategy)
session.mount('http://', adapter)
session.mount('https://', adapter)
response = session.get('https://unstable-api.example.com')
Frequently Asked Questions
What is Requests in Python?
Requests is a third-party Python library designed for making HTTP requests. It provides a simple and convenient interface for working with web services, APIs, and web scraping.
How to Solve the Problem of Request Blocking?
If a website blocks your requests, try the following methods:
- Add a realistic User-Agent header
- Use delays between requests
- Configure proxy server rotation
- Simulate real browser behavior
- Follow the site's robots.txt rules
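The standard library's urllib.robotparser can check the rules before you request a page. Here the robots.txt body is supplied inline so the example needs no network access; in practice you would fetch https://example.com/robots.txt first, and the paths shown are made up:

```python
from urllib import robotparser

rp = robotparser.RobotFileParser()
# Parse a robots.txt body directly (normally fetched from the site)
rp.parse([
    'User-agent: *',
    'Disallow: /private/',
])

print(rp.can_fetch('my-bot', 'https://example.com/private/page'))  # False
print(rp.can_fetch('my-bot', 'https://example.com/public/page'))   # True
```

Checking `can_fetch` before each scraping request is a cheap way to stay within a site's stated policy.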
Differences Between Requests and Standard Urllib
The requests library has several advantages over the standard urllib:
- Simpler and more intuitive API
- Automatic cookie and session handling
- Built-in JSON support
- Better error handling
- Support for various authentication methods
- More readable code
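The readability difference shows up even in building a URL with query parameters. The following prepares (but does not send) the same request both ways; the URL is a placeholder:

```python
from urllib import parse, request as urllib_request

import requests

params = {'q': 'python', 'page': 2}
headers = {'Accept': 'application/json'}

# urllib: encode the query string and build a Request object by hand
url = 'https://www.example.com/search?' + parse.urlencode(params)
urllib_req = urllib_request.Request(url, headers=headers)

# requests: the same request prepared in one call
requests_req = requests.Request(
    'GET', 'https://www.example.com/search', params=params, headers=headers
).prepare()

print(urllib_req.full_url)
print(requests_req.url)  # identical URL, built for you
```

With urllib you also decode the response bytes and parse JSON yourself; requests handles both via response.text and response.json().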
Working with REST APIs
The requests library is ideal for working with REST APIs due to its support for all HTTP methods and automatic JSON processing:
# GET - retrieving data
response = requests.get('https://api.example.com/users/1')
user = response.json()
# POST - creating a new record
new_user = {'name': 'John', 'email': 'john@example.com'}
response = requests.post('https://api.example.com/users', json=new_user)
# PUT - updating a record
updated_user = {'name': 'John Updated', 'email': 'john.new@example.com'}
response = requests.put('https://api.example.com/users/1', json=updated_user)
# DELETE - deleting a record
response = requests.delete('https://api.example.com/users/1')
Useful Resources for Learning
For in-depth study of the requests library, it is recommended to familiarize yourself with the official documentation and additional materials:
- Official Documentation
- Code examples on GitHub
- Training articles and video tutorials
- Forums and Python developer communities
Conclusion
The requests library is a powerful and versatile tool for working with HTTP requests in Python. It is equally well suited for simple tasks of retrieving data from the Internet and for complex integrations with modern APIs and authentication systems.
Key benefits of using requests include ease of learning, reliability, extensive functionality, and active community support. When working with the library, it is important to remember security, properly handle possible errors, and always use trusted data sources.
By mastering the principles of working with requests, you will get a reliable tool for solving a wide range of tasks related to network interaction in Python applications.