Selenium: Browser Automation

What Is Selenium and Why Do You Need It

Selenium is a powerful open‑source framework designed for automating actions in web browsers. This library enables developers to programmatically control browsers, mimicking real user interactions: navigating pages, entering data into forms, clicking elements, handling dropdowns, and many other operations.

Selenium is widely used across various development and testing domains. Its primary use cases include automated UI testing, dynamic website web‑scraping, automating repetitive browser tasks, and verifying web‑application functionality.

Key Features and Benefits of Selenium

Cross‑Platform Browser Support

Selenium works with the major browsers, including Google Chrome, Mozilla Firefox, Microsoft Edge, and Safari. Other Chromium-based browsers such as Opera can typically be driven through a Chromium-compatible driver. This means the same code can usually run across different browsers with minimal changes.
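To illustrate how little needs to change between browsers, the sketch below picks a driver class by name. The `make_driver` helper and the `BROWSER_CLASSES` mapping are hypothetical names introduced for this example, not part of Selenium, and the code assumes Selenium 4 with the chosen browser installed locally.

```python
# Hypothetical mapping from a browser name to the Selenium driver class name.
BROWSER_CLASSES = {
    "chrome": "Chrome",
    "firefox": "Firefox",
    "edge": "Edge",
    "safari": "Safari",
}


def make_driver(browser_name):
    """Return a new WebDriver for the named browser (assumes it is installed)."""
    key = browser_name.strip().lower()
    if key not in BROWSER_CLASSES:
        raise ValueError(f"Unsupported browser: {browser_name!r}")
    # Imported lazily so the name check above runs even without Selenium installed.
    from selenium import webdriver
    return getattr(webdriver, BROWSER_CLASSES[key])()
```

With a helper like this, switching a script from Chrome to Firefox comes down to changing one string, for example `driver = make_driver("firefox")`.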

Realistic User Action Simulation

The framework can accurately reproduce virtually any user action: left‑ and right‑clicks, drag‑and‑drop, page scrolling, keyboard input, handling modal windows, and more.

Flexible Element Locators

Selenium offers numerous ways to locate elements on a page: by ID, class, tag name, CSS selectors, XPath expressions, link text, and other attributes. This provides maximum flexibility when working with diverse web pages.

Headless Mode Support

Running browsers in headless mode (without a GUI) makes Selenium an ideal tool for automation on servers, in containers, and CI/CD pipelines.

Advanced Capabilities

Selenium can work with multiple tabs and windows, handle JavaScript alerts, manage cookies and sessions, configure proxies, capture screenshots, and save page HTML.

Installation and Setup of Selenium

Installing the Core Library

To start using Selenium, install the core library via pip:

pip install selenium

Downloading and Configuring Browser Drivers

Each browser requires its corresponding driver, for example ChromeDriver for Chrome. Starting with Chrome 115, ChromeDriver builds are published together with Chrome for Testing, which can be downloaded from the official Google site.

Make sure the driver version matches the installed browser version. Selenium 4.6 and later includes Selenium Manager, which downloads a matching driver automatically when none is found on the system. Third-party tools like webdriver-manager offer the same convenience for older setups:

pip install webdriver-manager

Automatic Driver Management

A modern approach to driver handling uses a driver manager:

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service

service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)

Fundamentals of Working with Selenium

Launching a Browser and Basic Operations

A basic example of launching a browser and navigating pages:

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Configure browser options
options = Options()
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')

# Initialize the driver
driver = webdriver.Chrome(options=options)

# Navigate to pages
driver.get("https://example.com")
print(f"Page title: {driver.title}")
print(f"Current URL: {driver.current_url}")

# Close the browser
driver.quit()
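If an exception is raised before `driver.quit()` is reached, the browser process can be left running. One way to guard against that is a small context manager, sketched below. `managed_driver` is a hypothetical helper name, and `factory` is assumed to be any zero-argument callable that returns a driver.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with `factory` and always quit it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs even if the body raises, so no browser process is left behind.
        driver.quit()


# Possible usage (assumes `webdriver` and `options` from the example above):
# with managed_driver(lambda: webdriver.Chrome(options=options)) as driver:
#     driver.get("https://example.com")
```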

Navigation Control

Selenium provides complete control over browser navigation:

# Core navigation methods
driver.get("https://example.com")  # Navigate to URL
driver.refresh()                   # Refresh the page
driver.back()                      # Go back
driver.forward()                   # Go forward in history

# Retrieve page information
page_title = driver.title
current_url = driver.current_url
page_source = driver.page_source

Finding and Interacting with Elements

Element Locator Strategies

Selenium offers many strategies for locating elements on a page:

from selenium.webdriver.common.by import By

# Locate by various attributes
element_by_id = driver.find_element(By.ID, "username")
element_by_name = driver.find_element(By.NAME, "email")
element_by_class = driver.find_element(By.CLASS_NAME, "login-form")
element_by_tag = driver.find_element(By.TAG_NAME, "button")
element_by_css = driver.find_element(By.CSS_SELECTOR, ".navbar a[href='/login']")
element_by_xpath = driver.find_element(By.XPATH, "//input[@type='password']")
element_by_link_text = driver.find_element(By.LINK_TEXT, "Log In")
element_by_partial_link = driver.find_element(By.PARTIAL_LINK_TEXT, "Register")

# Locate multiple elements
elements = driver.find_elements(By.CLASS_NAME, "item")

Interacting with Forms and Elements

Selenium enables a variety of actions on located elements:

# Working with text fields
username_field = driver.find_element(By.ID, "username")
username_field.clear()                    # Clear the field
username_field.send_keys("admin")         # Enter text

# Working with buttons
submit_button = driver.find_element(By.XPATH, "//button[@type='submit']")
submit_button.click()                     # Click the button

# Retrieve element information
element_text = submit_button.text
element_attribute = submit_button.get_attribute("class")
is_displayed = submit_button.is_displayed()
is_enabled = submit_button.is_enabled()
is_selected = submit_button.is_selected()  # Meaningful only for checkboxes, radio buttons, and <option> elements

Form Handling

Selenium simplifies web‑form operations:

# Locate and submit a form
form = driver.find_element(By.ID, "loginForm")
form.submit()  # Submit the form

# Alternative: click the submit button
submit_button = driver.find_element(By.XPATH, "//input[@type='submit']")
submit_button.click()
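When a form has several fields, the clear-then-type steps can be wrapped in a loop. The sketch below assumes every field can be located by its ID. `fill_form` is a hypothetical helper, and the default `"id"` is the string value behind `By.ID`.

```python
def fill_form(driver, values, by="id"):
    """Clear and fill each field; `values` maps a locator value to the text to type."""
    for locator_value, text in values.items():
        field = driver.find_element(by, locator_value)
        field.clear()          # Remove any pre-filled content first
        field.send_keys(text)  # Then type the new value
```

A call such as `fill_form(driver, {"username": "admin", "password": "secret"})` would fill both fields before the form is submitted.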

Working with Dropdowns and Checkboxes

Managing Dropdown Lists

For <select> elements Selenium provides a dedicated class:

from selenium.webdriver.support.ui import Select

# Locate the dropdown
dropdown = driver.find_element(By.ID, "country")
select = Select(dropdown)

# Various selection methods
select.select_by_visible_text("Russia")       # By visible text
select.select_by_value("RU")                  # By value
select.select_by_index(0)                     # By index

# Retrieve selected options
selected_option = select.first_selected_option
all_selected = select.all_selected_options
all_options = select.options

# Deselect (for multi‑select)
select.deselect_all()
select.deselect_by_visible_text("Russia")

Checkboxes and Radio Buttons

# Checkbox handling
checkbox = driver.find_element(By.ID, "agree")
if not checkbox.is_selected():
    checkbox.click()

# Radio button handling
radio_button = driver.find_element(By.XPATH, "//input[@type='radio'][@value='option1']")
radio_button.click()
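Because a click toggles a checkbox, it is often convenient to state the desired state rather than the action. The following sketch generalizes the `is_selected()` check above. `set_checkbox` is a hypothetical helper name.

```python
def set_checkbox(checkbox, checked):
    """Click the checkbox only if its current state differs from `checked`."""
    if checkbox.is_selected() != checked:
        checkbox.click()
```

Calling `set_checkbox(checkbox, True)` twice in a row leaves the box ticked, whereas two plain clicks would untick it again.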

Waits in Selenium

Implicit Waits

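Implicit waits set a global timeout for all element‑search operations. Selenium's documentation advises against mixing implicit and explicit waits, because the combination can lead to unpredictable wait times. Setting an implicit wait takes one call: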

# Set an implicit wait
driver.implicitly_wait(10)  # Wait up to 10 seconds

# All subsequent find operations will wait up to 10 seconds
element = driver.find_element(By.ID, "dynamic-element")

Explicit Waits

Explicit waits provide precise control over waiting for specific conditions:

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Create a wait object
wait = WebDriverWait(driver, 10)

# Wait for element presence
element = wait.until(
    EC.presence_of_element_located((By.ID, "success-message"))
)

# Wait for element to be clickable
clickable_element = wait.until(
    EC.element_to_be_clickable((By.ID, "submit-button"))
)

# Wait for element visibility
visible_element = wait.until(
    EC.visibility_of_element_located((By.CLASS_NAME, "alert"))
)

# Wait for element to disappear
wait.until(
    EC.invisibility_of_element_located((By.ID, "loading-spinner"))
)
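`WebDriverWait.until` accepts any callable that takes the driver and returns a truthy value once the condition holds, so custom conditions are possible when the built-in ones do not fit. The sketch below waits for a minimum number of matching elements. `at_least_n_elements` is a hypothetical name chosen for this example.

```python
def at_least_n_elements(by, value, count):
    """Build a wait condition that passes once `count` or more elements match."""
    def condition(driver):
        elements = driver.find_elements(by, value)
        # WebDriverWait keeps polling while the condition returns a falsy value.
        return elements if len(elements) >= count else False
    return condition


# Possible usage (assumes `driver`, `By`, and `WebDriverWait` from the examples above):
# rows = WebDriverWait(driver, 10).until(
#     at_least_n_elements(By.CSS_SELECTOR, "table tr", 5)
# )
```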

Window and Tab Management

Handling Multiple Tabs

Selenium can control several browser tabs:

# Open a new tab
driver.execute_script("window.open('https://example.com', '_blank');")

# Get list of all tabs
window_handles = driver.window_handles

# Switch between tabs
driver.switch_to.window(window_handles[1])  # Switch to second tab
driver.switch_to.window(window_handles[0])  # Return to first tab

# Close the second tab (close() acts on the tab that currently has focus)
driver.switch_to.window(window_handles[1])
driver.close()

# Switch to the remaining tab
driver.switch_to.window(window_handles[0])
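Indexing into `window_handles` assumes the handles come back in the order the tabs were opened, which the WebDriver specification does not guarantee. A more defensive approach, sketched here, is to record the handles before opening the tab and then look for the one that is new. `switch_to_new_window` is a hypothetical helper name.

```python
def switch_to_new_window(driver, handles_before):
    """Switch to the window that is not in `handles_before` and return its handle."""
    new_handles = [h for h in driver.window_handles if h not in handles_before]
    if len(new_handles) != 1:
        raise RuntimeError(f"Expected exactly one new window, found {len(new_handles)}")
    driver.switch_to.window(new_handles[0])
    return new_handles[0]


# Possible usage (assumes an existing `driver`):
# before = driver.window_handles
# driver.execute_script("window.open('https://example.com', '_blank');")
# switch_to_new_window(driver, before)
```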

Working with Frames

# Switch to a frame (the three calls below are alternatives; use one of them)
driver.switch_to.frame("frame-name")  # By name
driver.switch_to.frame(0)             # By index
frame_element = driver.find_element(By.ID, "frame-id")
driver.switch_to.frame(frame_element)  # By element

# Switch to parent frame (one level up from a nested frame)
driver.switch_to.parent_frame()

# Return to default content (the top-level document)
driver.switch_to.default_content()
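Forgetting to leave a frame is a common source of "element not found" errors, because later lookups keep searching inside the frame. One option is a context manager that switches back automatically, sketched below. `inside_frame` is a hypothetical helper name.

```python
from contextlib import contextmanager


@contextmanager
def inside_frame(driver, frame_reference):
    """Switch into a frame for the duration of the block, then back to the top document."""
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Always return to the top-level document, even if the block raises.
        driver.switch_to.default_content()


# Possible usage (assumes `driver` and `By` from the examples above):
# with inside_frame(driver, "frame-name"):
#     driver.find_element(By.ID, "inner-button").click()
```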

Handling JavaScript Alerts

Working with Modal Dialogs

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait for an alert to appear
wait = WebDriverWait(driver, 10)
alert = wait.until(EC.alert_is_present())

# Get alert text
alert_text = alert.text
print(f"Alert text: {alert_text}")

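# An alert disappears once it is accepted or dismissed,
# so the three options below are alternatives: pick one per alert.

# Accept the alert
alert.accept()

# Dismiss the alert (for confirm dialogs)
alert.dismiss()

# Send text to a prompt, then accept it
alert.send_keys("Entered text")
alert.accept()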

Advanced Capabilities

Executing JavaScript

Selenium can run arbitrary JavaScript code:

# Execute JavaScript
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")

# Return a value from JavaScript
page_height = driver.execute_script("return document.body.scrollHeight;")

# Interact with elements via JavaScript
element = driver.find_element(By.ID, "hidden-button")
driver.execute_script("arguments[0].click();", element)

# Modify element styles
driver.execute_script("arguments[0].style.border='3px solid red';", element)
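The two scrolling calls above can be combined to load pages that append content as the user scrolls. The sketch below repeats the scroll until the page height stops growing. `scroll_to_bottom` is a hypothetical helper, and the fixed `pause` is a crude assumption about how long new content takes to load; an explicit wait on a specific element is more reliable when one is available.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=20):
    """Scroll down until the page height stops growing; return the number of scrolls."""
    last_height = driver.execute_script("return document.body.scrollHeight;")
    for rounds in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(pause)  # Crude pause to let lazy-loaded content arrive
        new_height = driver.execute_script("return document.body.scrollHeight;")
        if new_height == last_height:
            return rounds
        last_height = new_height
    return max_rounds
```

The `max_rounds` cap keeps the loop from running forever on pages that never stop loading.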

Cookie Management

# Add a cookie
driver.add_cookie({
    'name': 'session_id',
    'value': 'abc123',
    'domain': '.example.com',
    'path': '/',
    'secure': True,
    'httpOnly': False
})

# Retrieve all cookies
all_cookies = driver.get_cookies()

# Get a specific cookie
session_cookie = driver.get_cookie('session_id')

# Delete a cookie
driver.delete_cookie('session_id')

# Delete all cookies
driver.delete_all_cookies()
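A common use of these methods is keeping a logged-in session between runs. The sketch below stores cookies in a JSON file and restores them later. `save_cookies` and `load_cookies` are hypothetical helper names. Note that `add_cookie` only works after the browser has opened a page on the cookie's domain, so the restoring script should call `driver.get(...)` first.

```python
import json


def save_cookies(driver, path):
    """Write all cookies of the current session to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(driver.get_cookies(), fh)


def load_cookies(driver, path):
    """Add every cookie from the JSON file to the session; return how many were added."""
    with open(path, encoding="utf-8") as fh:
        cookies = json.load(fh)
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)
```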

Taking Screenshots

# Screenshot of the visible part of the page (the current viewport)
driver.save_screenshot("page.png")

# Screenshot of a specific element
element = driver.find_element(By.ID, "content")
element.screenshot("element.png")

# Screenshot as base64 (useful for integrations)
import base64
screenshot_base64 = driver.get_screenshot_as_base64()
screenshot_bytes = base64.b64decode(screenshot_base64)  # Raw PNG bytes

Configuring Headless Mode

Headless Configuration

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Set up Chrome in headless mode
chrome_options = Options()
chrome_options.add_argument("--headless=new")  # Newer headless mode (Chrome 109+); plain "--headless" also works
chrome_options.add_argument("--no-sandbox")
chrome_options.add_argument("--disable-dev-shm-usage")
chrome_options.add_argument("--disable-gpu")
chrome_options.add_argument("--window-size=1920,1080")

driver = webdriver.Chrome(options=chrome_options)

# Equivalent setup for Firefox
firefox_options = webdriver.FirefoxOptions()
firefox_options.add_argument("--headless")
firefox_driver = webdriver.Firefox(options=firefox_options)

Integration with Test Frameworks

Using with pytest

import pytest
from selenium import webdriver
from selenium.webdriver.common.by import By

@pytest.fixture
def driver():
    driver = webdriver.Chrome()
    yield driver
    driver.quit()

def test_login(driver):
    driver.get("https://example.com/login")
    
    username_field = driver.find_element(By.ID, "username")
    password_field = driver.find_element(By.ID, "password")
    submit_button = driver.find_element(By.ID, "submit")
    
    username_field.send_keys("testuser")
    password_field.send_keys("password123")
    submit_button.click()
    
    assert "dashboard" in driver.current_url

Using with unittest

import unittest
from selenium import webdriver
from selenium.webdriver.common.by import By

class LoginTest(unittest.TestCase):
    def setUp(self):
        self.driver = webdriver.Chrome()
    
    def tearDown(self):
        self.driver.quit()
    
    def test_successful_login(self):
        self.driver.get("https://example.com/login")
        
        username_field = self.driver.find_element(By.ID, "username")
        password_field = self.driver.find_element(By.ID, "password")
        
        username_field.send_keys("admin")
        password_field.send_keys("password")
        
        submit_button = self.driver.find_element(By.ID, "submit")
        submit_button.click()
        
        self.assertIn("dashboard", self.driver.current_url)

if __name__ == "__main__":
    unittest.main()

Integration with Other Libraries

Combining Selenium with BeautifulSoup

from selenium import webdriver
from bs4 import BeautifulSoup

driver = webdriver.Chrome()
driver.get("https://example.com")

# Get page HTML
html = driver.page_source

# Parse with BeautifulSoup
soup = BeautifulSoup(html, 'html.parser')

# Extract data
title = soup.title.text
links = soup.find_all('a')
paragraphs = soup.find_all('p')

driver.quit()

Using Selenium with pandas for Data Analysis

import pandas as pd
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/table")

# Locate the table
table = driver.find_element(By.ID, "data-table")

# Extract rows
rows = table.find_elements(By.TAG_NAME, "tr")
data = []

for row in rows:
    cells = row.find_elements(By.TAG_NAME, "td")
    if cells:
        data.append([cell.text for cell in cells])

# Create a DataFrame
df = pd.DataFrame(data, columns=['Column1', 'Column2', 'Column3'])
print(df)

driver.quit()

Performance Optimization

Speed‑Up Strategies

from selenium import webdriver
from selenium.webdriver.chrome.options import Options

# Optimized Chrome settings
options = Options()
options.add_argument("--headless")
options.add_argument("--no-sandbox")
options.add_argument("--disable-dev-shm-usage")
options.add_argument("--disable-gpu")
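options.add_argument("--disable-images")      # May be ignored by current Chrome; the prefs below do the blocking
options.add_argument("--disable-javascript")  # If JS is not needed; may also be ignored by current Chrome
options.add_argument("--disable-plugins")
options.add_argument("--disable-extensions")

# Block image loading and notification prompts via preferences (2 = block)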
prefs = {
    "profile.managed_default_content_settings.images": 2,
    "profile.default_content_setting_values.notifications": 2
}
options.add_experimental_option("prefs", prefs)

driver = webdriver.Chrome(options=options)

Using a Driver Pool

from selenium import webdriver
import threading

class WebDriverPool:
    """A simple fixed-size pool of Chrome drivers shared between threads."""

    def __init__(self, size=3):
        self.drivers = []
        self.lock = threading.Lock()

        for _ in range(size):
            driver = webdriver.Chrome()
            self.drivers.append(driver)

    def get_driver(self):
        # Returns None when every driver is currently in use
        with self.lock:
            if self.drivers:
                return self.drivers.pop()
            return None

    def return_driver(self, driver):
        with self.lock:
            self.drivers.append(driver)

    def close_all(self):
        # Quits only the drivers that have been returned to the pool
        with self.lock:
            for driver in self.drivers:
                driver.quit()
            self.drivers.clear()
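The pool in this section hands back `None` when no driver is free, so callers have to retry on their own. A possible variation, sketched below, uses `queue.Queue` so that borrowers block until a driver is returned. `BlockingDriverPool` and `borrow` are hypothetical names, and `factory` is assumed to be any zero-argument callable that creates a driver, for example `webdriver.Chrome`.

```python
import queue
from contextlib import contextmanager


class BlockingDriverPool:
    """Fixed-size pool; `factory` is any zero-argument callable returning a driver."""

    def __init__(self, factory, size=3):
        self._drivers = queue.Queue()
        for _ in range(size):
            self._drivers.put(factory())

    @contextmanager
    def borrow(self, timeout=None):
        # Blocks until a driver is free; raises queue.Empty if `timeout` expires.
        driver = self._drivers.get(timeout=timeout)
        try:
            yield driver
        finally:
            # Hand the driver back even if the borrowing code raised.
            self._drivers.put(driver)

    def close_all(self):
        # Quit every idle driver; call this only after all borrowers are done.
        while True:
            try:
                self._drivers.get_nowait().quit()
            except queue.Empty:
                break
```

Each worker thread can then run `with pool.borrow() as driver: ...`, and the driver goes back to the pool as soon as the block ends.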

Selenium Methods and Functions Table

Category | Method / Function | Description | Example Usage
Initialization | webdriver.Chrome() | Launch Chrome browser | driver = webdriver.Chrome()
Initialization | webdriver.Firefox() | Launch Firefox browser | driver = webdriver.Firefox()
Initialization | webdriver.Edge() | Launch Edge browser | driver = webdriver.Edge()
Initialization | webdriver.Safari() | Launch Safari browser | driver = webdriver.Safari()
Navigation | driver.get(url) | Navigate to URL | driver.get("https://example.com")
Navigation | driver.back() | Go back to previous page | driver.back()
Navigation | driver.forward() | Go forward in history | driver.forward()
Navigation | driver.refresh() | Refresh the page | driver.refresh()
Element Finding | find_element(By.ID, id) | Find by ID | element = driver.find_element(By.ID, "username")
Element Finding | find_element(By.NAME, name) | Find by name | element = driver.find_element(By.NAME, "email")
Element Finding | find_element(By.CLASS_NAME, class) | Find by class | element = driver.find_element(By.CLASS_NAME, "button")
Element Finding | find_element(By.TAG_NAME, tag) | Find by tag | element = driver.find_element(By.TAG_NAME, "input")
Element Finding | find_element(By.CSS_SELECTOR, selector) | Find by CSS selector | element = driver.find_element(By.CSS_SELECTOR, ".nav a")
Element Finding | find_element(By.XPATH, xpath) | Find by XPath | element = driver.find_element(By.XPATH, "//button[@type='submit']")
Element Finding | find_element(By.LINK_TEXT, text) | Find by link text | element = driver.find_element(By.LINK_TEXT, "Log In")
Element Finding | find_element(By.PARTIAL_LINK_TEXT, text) | Find by partial link text | element = driver.find_element(By.PARTIAL_LINK_TEXT, "Register")
Element Finding | find_elements(by, value) | Find multiple elements | elements = driver.find_elements(By.CLASS_NAME, "item")
Interaction | element.click() | Click an element | button.click()
Interaction | element.send_keys(text) | Enter text | input_field.send_keys("text")
Interaction | element.clear() | Clear a field | input_field.clear()
Interaction | element.submit() | Submit a form | form.submit()
Interaction | element.text | Get element text | text = element.text
Interaction | element.get_attribute(attr) | Get attribute value | value = element.get_attribute("class")
Interaction | element.is_displayed() | Check visibility | is_visible = element.is_displayed()
Interaction | element.is_enabled() | Check enabled state | is_active = element.is_enabled()
Interaction | element.is_selected() | Check selection state | is_checked = element.is_selected()
Page Information | driver.title | Page title | title = driver.title
Page Information | driver.current_url | Current URL | url = driver.current_url
Page Information | driver.page_source | Page HTML source | html = driver.page_source
Windows and Frames | driver.window_handles | List of open windows/tabs | handles = driver.window_handles
Windows and Frames | driver.switch_to.window(handle) | Switch between windows/tabs | driver.switch_to.window(handles[1])
Windows and Frames | driver.switch_to.frame(frame) | Switch to a frame | driver.switch_to.frame("frame-name")
Windows and Frames | driver.switch_to.default_content() | Return to main content | driver.switch_to.default_content()
Windows and Frames | driver.switch_to.parent_frame() | Switch to parent frame | driver.switch_to.parent_frame()
Waits | driver.implicitly_wait(seconds) | Implicit wait | driver.implicitly_wait(10)
Waits | WebDriverWait(driver, timeout) | Create a wait object | wait = WebDriverWait(driver, 10)
Waits | wait.until(condition) | Wait for a condition | wait.until(EC.presence_of_element_located((By.ID, "element")))
Dropdowns | Select(element) | Create a Select object | select = Select(dropdown)
Dropdowns | select.select_by_visible_text(text) | Select by visible text | select.select_by_visible_text("Option")
Dropdowns | select.select_by_value(value) | Select by value | select.select_by_value("option1")
Dropdowns | select.select_by_index(index) | Select by index | select.select_by_index(0)
Dropdowns | select.deselect_all() | Deselect all options (multi‑select only) | select.deselect_all()
Alerts | driver.switch_to.alert | Switch to alert | alert = driver.switch_to.alert
Alerts | alert.accept() | Accept alert | alert.accept()
Alerts | alert.dismiss() | Dismiss alert | alert.dismiss()
Alerts | alert.text | Alert text | message = alert.text
Alerts | alert.send_keys(text) | Send input to prompt | alert.send_keys("text")
JavaScript | driver.execute_script(script) | Execute JavaScript | driver.execute_script("window.scrollTo(0, 500)")
JavaScript | driver.execute_script(script, *args) | Execute JS with arguments | driver.execute_script("arguments[0].click()", element)
Screenshots | driver.save_screenshot(filename) | Capture screenshot of the visible page | driver.save_screenshot("page.png")
Screenshots | element.screenshot(filename) | Capture element screenshot | element.screenshot("element.png")
Screenshots | driver.get_screenshot_as_base64() | Screenshot as base64 string | base64_img = driver.get_screenshot_as_base64()
Cookies | driver.add_cookie(cookie_dict) | Add a cookie | driver.add_cookie({"name": "session", "value": "123"})
Cookies | driver.get_cookies() | Get all cookies | cookies = driver.get_cookies()
Cookies | driver.get_cookie(name) | Get a specific cookie | cookie = driver.get_cookie("session")
Cookies | driver.delete_cookie(name) | Delete a cookie | driver.delete_cookie("session")
Cookies | driver.delete_all_cookies() | Delete all cookies | driver.delete_all_cookies()
Termination | driver.close() | Close current window | driver.close()
Termination | driver.quit() | Quit the WebDriver session | driver.quit()

Deployment and Automation

CI/CD Integration

Selenium integrates smoothly with popular CI/CD platforms:

GitHub Actions:

name: Selenium Tests
on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
    - uses: actions/checkout@v4
    - name: Set up Python
      uses: actions/setup-python@v5
      with:
        python-version: '3.9'
    - name: Install dependencies
      run: |
        pip install selenium pytest webdriver-manager
    - name: Run tests
      run: |
        pytest tests/

Jenkins Pipeline:

pipeline {
    agent any
    stages {
        stage('Test') {
            steps {
                sh 'pip install selenium pytest pytest-html'
                sh 'pytest --html=report.html tests/'
            }
        }
    }
}

Using Selenium in Docker

FROM python:3.9-slim

# Install Chrome (Selenium 4.6+ can fetch a matching ChromeDriver at runtime)
RUN apt-get update && apt-get install -y \
    wget \
    gnupg \
    unzip \
    && wget -q -O - https://dl-ssl.google.com/linux/linux_signing_key.pub | gpg --dearmor -o /usr/share/keyrings/google-chrome.gpg \
    && echo "deb [arch=amd64 signed-by=/usr/share/keyrings/google-chrome.gpg] http://dl.google.com/linux/chrome/deb/ stable main" > /etc/apt/sources.list.d/google.list \
    && apt-get update \
    && apt-get install -y google-chrome-stable \
    && rm -rf /var/lib/apt/lists/*

# Install Python dependencies
COPY requirements.txt .
RUN pip install -r requirements.txt

# Copy test suite
COPY tests/ /app/tests/
WORKDIR /app

# Run tests
CMD ["python", "-m", "pytest", "tests/"]

Frequently Asked Questions

What is Selenium and what is it used for?

Selenium is a suite of tools for automating web browsers. It is used for automated testing of web applications, web scraping, automating repetitive browser tasks, and verifying user interfaces.

Which browsers does Selenium support?

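Selenium supports the major browsers: Google Chrome, Mozilla Firefox, Microsoft Edge, and Safari, as well as legacy Internet Explorer. Chromium-based browsers such as Opera can typically be driven through a Chromium-compatible driver. Each browser requires its corresponding driver.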

Can Selenium be used without a graphical interface?

Yes, Selenium supports headless mode for most browsers. This is especially useful for automation on servers, in Docker containers, and CI/CD pipelines.

Is Selenium suitable for web scraping?

Selenium excels at scraping dynamic sites where content is rendered via JavaScript. For simple static sites, tools like requests + BeautifulSoup may be more efficient.

How can I speed up Selenium?

To improve speed, use headless mode, disable image loading, replace fixed time.sleep calls with explicit waits, and consider a driver pool for parallel execution.

What are alternatives to Selenium?

Popular alternatives include Playwright, Puppeteer, Cypress for testing, and Scrapy with Splash for web scraping.

How do I handle dynamic content?

Use explicit waits with WebDriverWait and appropriate expected_conditions to wait for elements to load instead of using static delays.

Can Selenium tests run in parallel?

Yes, Selenium supports parallel test execution. You can use pytest-xdist with pytest or TestNG for Java. Each test should instantiate its own WebDriver instance.

Selenium remains one of the most powerful and popular tools for browser automation. Its flexibility, cross‑platform nature, and extensive ecosystem make it indispensable for developers and testers of web applications.

 
