Working with Excel files in Python: a complete guide
Working with Excel files in Python relies on specialized libraries that read, write, and process data in the .xlsx and .xls formats. Let's look at the four main libraries and their practical applications.
Basic libraries for working with Excel
openpyxl is a universal library for the modern Excel format (.xlsx). It supports reading, writing, and editing files, including formatting cells, creating charts, and working with formulas.
xlrd is a specialized library for reading legacy Excel (.xls) files, the format used by Excel 2003 and earlier. Since version 2.0 it reads only .xls files.
pandas is a powerful data analysis library that simplifies working with Excel files thanks to the DataFrame data structure. It supports both formats and provides rich functionality for data processing.
xlsxwriter is a library focused on creating new Excel files with advanced formatting and data visualization capabilities.
Installing Libraries
pip install openpyxl xlrd pandas xlsxwriter
Practical examples
Reading data using openpyxl
from openpyxl import load_workbook
# Loading an Excel file
wb = load_workbook('example.xlsx')
sheet = wb.active
# Iterate through all rows
for row in sheet.iter_rows(values_only=True):
    print(row)
# Reading a specific cell
cell_value = sheet['A1'].value
print(f"A1 cell value: {cell_value}")
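Rows read with iter_rows are plain tuples, so it is often convenient to pair them with the header row. The following is a minimal sketch, assuming the first row of the sheet holds column names; the helper read_rows_as_dicts and the sample file are hypothetical and exist only to make the example runnable.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook


def read_rows_as_dicts(path):
    """Return each data row as a dict keyed by the header row (hypothetical helper)."""
    wb = load_workbook(path, read_only=True)
    try:
        rows = wb.active.iter_rows(values_only=True)
        header = next(rows, None)
        if header is None:  # the sheet has no rows at all
            return []
        return [dict(zip(header, row)) for row in rows]
    finally:
        wb.close()  # read-only workbooks keep the file handle open until closed


# Build a small sample file so the sketch runs on its own.
sample_path = os.path.join(tempfile.mkdtemp(), "example.xlsx")
sample = Workbook()
sample.active.append(["Name", "Value"])
sample.active.append(["Product 1", 100])
sample.save(sample_path)

records = read_rows_as_dicts(sample_path)
print(records)
```

Each record can then be accessed by column name, for example records[0]["Value"].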
Reading data using xlrd
import xlrd
# Opening an Excel file
wb = xlrd.open_workbook('example.xls')
sheet = wb.sheet_by_index(0)
# Reading all rows
for row in range(sheet.nrows):
    print(sheet.row_values(row))
# Reading a specific cell
cell_value = sheet.cell_value(0, 0)
print(f"Value of cell A1: {cell_value}")
Working with data via pandas
import pandas as pd
# Loading data from Excel to DataFrame
df = pd.read_excel('example.xlsx')
print(df.head())
# Data processing
df_filtered = df[df['column name'] > 100]
# Saving processed data
df_filtered.to_excel('filtered_output.xlsx', index=False)
# Working with multiple sheets: sheet_name=None returns a dict of DataFrames keyed by sheet name
df_dict = pd.read_excel('example.xlsx', sheet_name=None)
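To illustrate what the dictionary returned by sheet_name=None looks like, here is a small sketch that writes two sheets and reads them back. The sheet names, column names, and values are made up for the example.

```python
import os
import tempfile

import pandas as pd

path = os.path.join(tempfile.mkdtemp(), "multi_sheet.xlsx")

# Write two sheets into one workbook (names and data are illustrative only).
with pd.ExcelWriter(path) as writer:
    pd.DataFrame({"item": ["a", "b"], "amount": [50, 150]}).to_excel(
        writer, sheet_name="January", index=False
    )
    pd.DataFrame({"item": ["c"], "amount": [300]}).to_excel(
        writer, sheet_name="February", index=False
    )

# sheet_name=None returns a dict: {sheet name: DataFrame}
sheets = pd.read_excel(path, sheet_name=None)

# Stack all sheets and keep the sheet name as a regular column.
combined = pd.concat(sheets, names=["sheet", "row"]).reset_index(level="sheet")
large = combined[combined["amount"] > 100]
print(large)
```

Keeping the sheet name as a column makes it possible to filter or group across sheets in a single DataFrame.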
Creating files using xlsxwriter
import xlsxwriter
# Creating a new Excel file
workbook = xlsxwriter.Workbook('formatted_output.xlsx')
worksheet = workbook.add_worksheet()
# Adding formatted data
bold = workbook.add_format({'bold': True})
worksheet.write('A1', 'Heading', bold)
worksheet.write('A2', 'Data')
# Chart creation
chart = workbook.add_chart({'type': 'line'})
chart.add_series({'values': '=Sheet1!$A$2:$A$10'})
worksheet.insert_chart('C1', chart)
# Closing the workbook writes the file to disk
workbook.close()
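A chart is only meaningful when the referenced range holds numbers. The sketch below is one possible variation that writes a column of made-up numeric values and builds the chart range from the data length; the sheet name "Data" and the values are assumptions for illustration.

```python
import os
import tempfile

import xlsxwriter

path = os.path.join(tempfile.mkdtemp(), "chart_example.xlsx")
values = [10, 40, 25, 60]  # made-up sample data

workbook = xlsxwriter.Workbook(path)
# An explicit sheet name keeps the chart's range reference unambiguous.
worksheet = workbook.add_worksheet("Data")
bold = workbook.add_format({"bold": True})

worksheet.write("A1", "Sales", bold)
worksheet.write_column("A2", values)  # fills A2, A3, ... downward

chart = workbook.add_chart({"type": "line"})
last_row = len(values) + 1  # Excel row number of the last value
chart.add_series({
    "name": "=Data!$A$1",
    "values": f"=Data!$A$2:$A${last_row}",
})
worksheet.insert_chart("C1", chart)
workbook.close()  # the file is written only at this point
```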
Advanced Excel Features
Creating and editing files using openpyxl
from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill
# Create a new file
wb = Workbook()
ws = wb.active
# Adding data
ws['A1'] = 'Name'
ws['B1'] = 'Value'
ws['A2'] = 'Product 1'
ws['B2'] = 100
# Formatting cells
ws['A1'].font = Font(bold=True)
ws['A1'].fill = PatternFill(start_color='FFFF00', end_color='FFFF00', fill_type='solid')
# Saving the file
wb.save('formatted_example.xlsx')
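Since openpyxl also works with formulas, a short sketch may help show what that means in practice. openpyxl stores a formula as text and does not calculate it, so a file that has never been opened in a spreadsheet application has no cached result. The cell layout below is an assumption for the example.

```python
import os
import tempfile

from openpyxl import Workbook, load_workbook

path = os.path.join(tempfile.mkdtemp(), "formula_example.xlsx")

wb = Workbook()
ws = wb.active
ws.append(["Name", "Value"])
ws.append(["Product 1", 100])
ws.append(["Product 2", 250])
ws["A4"] = "Total"
ws["B4"] = "=SUM(B2:B3)"  # stored as a formula string, not evaluated
wb.save(path)

# Default load: formulas come back as the text that was written.
formula = load_workbook(path).active["B4"].value

# data_only=True asks for the cached result instead; it is None here
# because no spreadsheet application has calculated and saved the file.
cached = load_workbook(path, data_only=True).active["B4"].value

print(formula, cached)
```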
Processing large files using pandas
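import pandas as pd
# pd.read_excel has no chunksize parameter, so read the file in parts with skiprows/nrows
chunk_size = 1000
chunks = []
columns = pd.read_excel('large_file.xlsx', nrows=0).columns
start = 0
while True:
    # Skip the header row plus the rows already processed
    chunk = pd.read_excel('large_file.xlsx', header=None, names=columns, skiprows=start + 1, nrows=chunk_size)
    if chunk.empty:
        break
    chunks.append(chunk.groupby('category').sum())
    start += chunk_size
# Combining processed parts: add up the per-chunk sums for each category
result = pd.concat(chunks).groupby(level=0).sum()
result.to_excel('processed_large_file.xlsx')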
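As an alternative to pandas for very large files, openpyxl's read-only mode streams rows instead of loading the whole workbook into memory. This is a sketch of that different technique, not the pandas approach above; the helper sum_by_category and the column names "category" and "amount" are hypothetical.

```python
import os
import tempfile
from collections import defaultdict

from openpyxl import Workbook, load_workbook


def sum_by_category(path, category_col="category", value_col="amount"):
    """Stream the active sheet row by row and total value_col per category."""
    totals = defaultdict(float)
    wb = load_workbook(path, read_only=True)
    try:
        rows = wb.active.iter_rows(values_only=True)
        header = list(next(rows))
        cat_idx = header.index(category_col)
        val_idx = header.index(value_col)
        for row in rows:
            if row[val_idx] is not None:  # ignore empty cells
                totals[row[cat_idx]] += row[val_idx]
    finally:
        wb.close()
    return dict(totals)


# Generate a sample file so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "large_file.xlsx")
wb = Workbook()
ws = wb.active
ws.append(["category", "amount"])
for i in range(3000):
    ws.append(["even" if i % 2 == 0 else "odd", 1])
wb.save(path)

totals = sum_by_category(path)
print(totals)
```

Only one row is held in memory at a time, so memory use stays roughly constant as the file grows.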
Choosing the appropriate library
Use openpyxl for:
- Working with modern Excel files (.xlsx)
- Editing existing files
- Applying formatting and styles
- Working with formulas
Use xlrd for:
- Reading old Excel (.xls) files
- Simple data extraction
- Working with legacy systems
Use pandas for:
- Data analysis and processing
- Working with large amounts of information
- Integration with other data sources
- Statistical processing
Use xlsxwriter for:
- Creating new files from scratch
- Creating reports with charts
- Creating files with advanced formatting
- Automating document creation
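For reading, the choice between openpyxl and xlrd follows directly from the file extension. A minimal sketch of that rule, assuming pandas is used for reading; the mapping READ_ENGINES and the helper pick_read_engine are hypothetical names introduced here.

```python
from pathlib import Path

# Extension -> pandas read engine, following the guidance above.
READ_ENGINES = {
    ".xlsx": "openpyxl",  # modern format
    ".xls": "xlrd",       # legacy format
}


def pick_read_engine(filename):
    """Return the pandas engine name for an Excel file, based on its extension."""
    suffix = Path(filename).suffix.lower()
    try:
        return READ_ENGINES[suffix]
    except KeyError:
        raise ValueError(f"Unsupported Excel extension: {suffix!r}") from None


print(pick_read_engine("report.xlsx"))
print(pick_read_engine("archive/old_report.XLS"))
```

The result can be passed on as pd.read_excel(path, engine=pick_read_engine(path)).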
Error handling and optimization
import pandas as pd
try:
    # Safe file reading
    df = pd.read_excel('data.xlsx')
    # Checking data availability
    if df.empty:
        print("File is empty")
    else:
        print(f"Loaded {len(df)} rows")
except FileNotFoundError:
    print("File not found")
except Exception as e:
    print(f"Error reading file: {e}")
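A missing sheet is another common failure besides a missing file. One way to handle both, sketched here under the assumption that returning None is an acceptable signal to the caller, is to check the sheet list before parsing. The helper load_sheet and the sample sheet name are hypothetical.

```python
import os
import tempfile

import pandas as pd


def load_sheet(path, sheet_name):
    """Return the sheet as a DataFrame, or None if the file or sheet is missing."""
    try:
        with pd.ExcelFile(path) as xls:
            if sheet_name not in xls.sheet_names:
                print(f"Sheet {sheet_name!r} not found; available: {xls.sheet_names}")
                return None
            return xls.parse(sheet_name)
    except FileNotFoundError:
        print(f"File not found: {path}")
        return None


# Create a sample workbook so the sketch runs on its own.
workdir = tempfile.mkdtemp()
path = os.path.join(workdir, "data.xlsx")
pd.DataFrame({"value": [1, 2, 3]}).to_excel(path, sheet_name="Data", index=False)

df = load_sheet(path, "Data")
missing_sheet = load_sheet(path, "Nope")
missing_file = load_sheet(os.path.join(workdir, "absent.xlsx"), "Data")
```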
The tables below summarize the main methods of each library for working with Excel in Python:
pandas (for working with Excel)
| Method | Description | Usage example |
|---|---|---|
| pd.read_excel() | Reading an Excel file | df = pd.read_excel('file.xlsx') |
| df.to_excel() | Writing to an Excel file | df.to_excel('output.xlsx') |
| pd.ExcelFile() | Creating an Excel file object | xls = pd.ExcelFile('file.xlsx') |
| xls.sheet_names | Getting a list of sheets | sheets = xls.sheet_names |
| xls.parse() | Reading a specific sheet | df = xls.parse('Sheet1') |
openpyxl (base classes and methods)
Workbook
| Method | Description | Usage example |
|---|---|---|
| Workbook() | Creating a new workbook | wb = Workbook() |
| load_workbook() | Loading an existing workbook | wb = load_workbook('file.xlsx') |
| wb.save() | Saving a workbook | wb.save('file.xlsx') |
| wb.create_sheet() | Creating a new sheet | ws = wb.create_sheet('New sheet') |
| wb.remove() | Deleting a sheet | wb.remove(wb['Sheet1']) |
| wb.sheetnames | List of sheet names | names = wb.sheetnames |
| wb.active | Active sheet | ws = wb.active |
| wb.copy_worksheet() | Copying a sheet | wb.copy_worksheet(ws) |
Worksheet
| Method | Description | Usage example |
|---|---|---|
| ws.cell() | Getting/setting the cell value | ws.cell(row=1, column=1, value='Hello') |
| ws.append() | Adding a row | ws.append(['A', 'B', 'C']) |
| ws.insert_rows() | Inserting rows | ws.insert_rows(1, 3) |
| ws.insert_cols() | Inserting columns | ws.insert_cols(1, 2) |
| ws.delete_rows() | Deleting rows | ws.delete_rows(1, 2) |
| ws.delete_cols() | Deleting columns | ws.delete_cols(1, 2) |
| ws.max_row | Last row with data | max_r = ws.max_row |
| ws.max_column | Last column with data | max_c = ws.max_column |
| ws.iter_rows() | Iterating through the rows | for row in ws.iter_rows(): |
| ws.iter_cols() | Iterating through the columns | for col in ws.iter_cols(): |
| ws.merge_cells() | Merging cells | ws.merge_cells('A1:B2') |
| ws.unmerge_cells() | Unmerging cells | ws.unmerge_cells('A1:B2') |
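The worksheet methods above can be combined as in the following sketch, which works entirely in memory. The sample data is made up, and the row numbers in the comments describe the state after each step.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active
ws.append(["Name", "Value"])
ws.append(["Product 1", 100])
ws.append(["Product 2", 200])

# Insert one empty row at the top; existing rows shift down by one.
ws.insert_rows(1)
ws["A1"] = "Price list"

# Merge the title across both columns, then undo it again.
ws.merge_cells("A1:B1")
merged_before = [rng.coord for rng in ws.merged_cells.ranges]
ws.unmerge_cells("A1:B1")
merged_after = [rng.coord for rng in ws.merged_cells.ranges]

# "Product 1" now sits on row 3; delete that row.
ws.delete_rows(3)

remaining = list(ws.iter_rows(min_row=2, values_only=True))
print(merged_before, merged_after, remaining)
```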
Cell
| Property/method | Description | Usage example |
|---|---|---|
| cell.value | Cell value | val = cell.value |
| cell.coordinate | Cell coordinate | coord = cell.coordinate |
| cell.row | Row number | row = cell.row |
| cell.column | Column number | col = cell.column |
| cell.column_letter | Column letter | letter = cell.column_letter |
| cell.font | Cell font | cell.font = Font(bold=True) |
| cell.fill | Cell fill | cell.fill = PatternFill(fill_type='solid') |
| cell.border | Cell borders | cell.border = Border(left=Side()) |
| cell.alignment | Alignment | cell.alignment = Alignment(horizontal='center') |
| cell.number_format | Number format | cell.number_format = '0.00' |
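A short sketch showing several of these cell properties together may be useful. The data, colors, and border style are arbitrary choices for illustration.

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Border, Font, PatternFill, Side

wb = Workbook()
ws = wb.active
ws.append(["Name", "Price"])
ws.append(["Product 1", 12.5])

# Style every cell in the header row (ws[1] is the tuple of cells in row 1).
thin = Side(style="thin", color="000000")
for cell in ws[1]:
    cell.font = Font(bold=True)
    cell.fill = PatternFill(start_color="FFFF00", end_color="FFFF00", fill_type="solid")
    cell.border = Border(bottom=thin)
    cell.alignment = Alignment(horizontal="center")

# Show the price with two decimal places and inspect the cell's position.
price = ws["B2"]
price.number_format = "0.00"
print(price.coordinate, price.row, price.column, price.column_letter)
```

The number format changes only how Excel displays the value; price.value is still the float 12.5.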
xlsxwriter (basic methods)
Workbook
| Method | Description | Usage example |
|---|---|---|
| xlsxwriter.Workbook() | Creating a new workbook | wb = xlsxwriter.Workbook('file.xlsx') |
| wb.add_worksheet() | Adding a sheet | ws = wb.add_worksheet() |
| wb.add_format() | Creating a format | fmt = wb.add_format({'bold': True}) |
| wb.add_chart() | Creating a chart | chart = wb.add_chart({'type': 'column'}) |
| wb.close() | Closing and saving | wb.close() |
Worksheet
| Method | Description | Usage example |
|---|---|---|
| ws.write() | Writing data | ws.write(0, 0, 'Hello') |
| ws.write_row() | Writing a row | ws.write_row(0, 0, ['A', 'B', 'C']) |
| ws.write_column() | Writing a column | ws.write_column(0, 0, [1, 2, 3]) |
| ws.write_formula() | Writing a formula | ws.write_formula(0, 2, '=A1+B1') |
| ws.insert_image() | Inserting an image | ws.insert_image('A1', 'image.png') |
| ws.insert_chart() | Inserting a chart created with wb.add_chart() | ws.insert_chart('C1', chart) |
| ws.set_column() | Setting column width | ws.set_column('A:A', 20) |
| ws.set_row() | Setting row height | ws.set_row(0, 30) |
| ws.merge_range() | Merging cells | ws.merge_range('A1:B2', 'Text') |
| ws.freeze_panes() | Freezing panes | ws.freeze_panes(1, 0) |
| ws.autofilter() | Autofilter | ws.autofilter('A1:D10') |
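Several of the worksheet methods above are typically used together when generating a small report. The following sketch assumes made-up product data and a four-column layout.

```python
import os
import tempfile

import xlsxwriter

path = os.path.join(tempfile.mkdtemp(), "report.xlsx")
rows = [["Product 1", 100, 3], ["Product 2", 250, 1]]  # sample data

workbook = xlsxwriter.Workbook(path)
worksheet = workbook.add_worksheet()
header = workbook.add_format({"bold": True})

# Header row, then one row per product with a formula in the last column.
worksheet.write_row(0, 0, ["Name", "Price", "Qty", "Total"], header)
for r, row in enumerate(rows, start=1):
    worksheet.write_row(r, 0, row)
    # r is zero-based, Excel row numbers are one-based
    worksheet.write_formula(r, 3, f"=B{r + 1}*C{r + 1}")

worksheet.set_column("A:A", 20)           # widen the name column
worksheet.freeze_panes(1, 0)              # keep the header visible when scrolling
worksheet.autofilter(0, 0, len(rows), 3)  # filter buttons on A1:D3
workbook.close()
```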
xlrd (for reading old .xls files)
Workbook
| Method | Description | Usage example |
|---|---|---|
| xlrd.open_workbook() | Opening a file | wb = xlrd.open_workbook('file.xls') |
| wb.sheet_names() | Sheet names | names = wb.sheet_names() |
| wb.sheet_by_index() | Sheet by index | ws = wb.sheet_by_index(0) |
| wb.sheet_by_name() | Sheet by name | ws = wb.sheet_by_name('Sheet1') |
| wb.nsheets | Number of sheets | count = wb.nsheets |
Worksheet
| Method | Description | Usage example |
|---|---|---|
| ws.cell_value() | Cell value | val = ws.cell_value(0, 0) |
| ws.cell_type() | Cell type | ctype = ws.cell_type(0, 0) |
| ws.nrows | Number of rows | rows = ws.nrows |
| ws.ncols | Number of columns | cols = ws.ncols |
| ws.row_values() | Row values | row = ws.row_values(0) |
| ws.col_values() | Column values | col = ws.col_values(0) |
xlwt (for writing to .xls)
Workbook
| Method | Description | Usage example |
|---|---|---|
| xlwt.Workbook() | Creating a workbook | wb = xlwt.Workbook() |
| wb.add_sheet() | Adding a sheet | ws = wb.add_sheet('Sheet1') |
| wb.save() | Saving | wb.save('file.xls') |
Worksheet
| Method | Description | Usage example |
|---|---|---|
| ws.write() | Writing data | ws.write(0, 0, 'Hello') |
| ws.write_merge() | Writing to merged cells | ws.write_merge(0, 1, 0, 1, 'Text') |