How to save data in Excel and CSV files?

онлайн тренажер по питону
Online Python Trainer for Beginners

Learn Python easily without overwhelming theory. Solve practical tasks with automatic checking, get hints in Russian, and write code directly in your browser — no installation required.

Start Course

Key Data Storage Formats in Python

Working with Excel and CSV files is a crucial skill for Python developers, especially in data analysis, report automation, and backend system development. Correctly saving data to files enables efficient storage, transfer, and processing of information between different systems.

This guide covers the main methods for saving data in popular formats using standard and third-party Python libraries. You'll learn practical examples of working with CSV and Excel files, mastering techniques for formatting and structuring data.

CSV Format: Features and Applications

What is CSV Format?

CSV (Comma-Separated Values) is a simple text format for storing tabular data. In this format, values are separated by commas or other delimiters, such as semicolons or tabs. CSV format is widely used for exchanging data between various applications and systems.

Advantages of CSV Format

  • Lightweight and readable text format
  • Quickly opens in any text editor, including Excel
  • Ideal for transferring data between systems
  • Small file size compared to binary formats
  • Supported by most data processing programs

Limitations of CSV Format

  • No support for complex formatting (colors, fonts)
  • Inability to store formulas and calculations
  • Limited support for different data types
  • No possibility of storing images and nested tables

Working with CSV via the Standard Library

Basics of Using the csv Module

The csv module is included in the standard Python distribution and requires no additional installation. It provides convenient tools for reading and writing CSV files with various formatting settings.

import csv

data = [
    ['Имя', 'Возраст', 'Город'],
    ['Иван', 28, 'Москва'],
    ['Мария', 25, 'Санкт-Петербург']
]

with open('people.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerows(data)

After executing this code, a file named people.csv is created with the following content:

Имя,Возраст,Город
Иван,28,Москва
Мария,25,Санкт-Петербург

Setting Delimiters in CSV

The standard delimiter in CSV can be changed depending on system requirements or regional settings:

with open('people.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file, delimiter=';')
    writer.writerows(data)

Working with DictWriter

For convenient work with data in the form of dictionaries, the DictWriter class is used:

import csv

data = [
    {'Имя': 'Иван', 'Возраст': 28, 'Город': 'Москва'},
    {'Имя': 'Мария', 'Возраст': 25, 'Город': 'Санкт-Петербург'}
]

with open('people.csv', 'w', newline='', encoding='utf-8') as file:
    fieldnames = ['Имя', 'Возраст', 'Город']
    writer = csv.DictWriter(file, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(data)

Working with Excel Files via openpyxl

Installation and Configuration of the Library

Excel files are used when it is necessary to store structured data with formatting, formulas, or charts. The openpyxl library provides a complete set of tools for working with Excel files.

pip install openpyxl

Creating a Basic Excel File

from openpyxl import Workbook

data = [
    ['Имя', 'Возраст', 'Город'],
    ['Иван', 28, 'Москва'],
    ['Мария', 25, 'Санкт-Петербург']
]

wb = Workbook()
ws = wb.active

for row in data:
    ws.append(row)

wb.save('people.xlsx')

Adding Formatting in Excel

The openpyxl library allows you to apply various formatting styles to cells:

from openpyxl.styles import Font, PatternFill, Alignment

# Formatting the header
ws['A1'].font = Font(bold=True, color='FF0000')
ws['A1'].fill = PatternFill(start_color='FFFF00', end_color='FFFF00', fill_type='solid')
ws['A1'].alignment = Alignment(horizontal='center')

wb.save('people_styled.xlsx')

Working with Formulas

You can add calculated formulas to Excel files:

ws['D1'] = 'Возраст через 10 лет'
ws['D2'] = '=B2+10'
ws['D3'] = '=B3+10'

wb.save('people_with_formulas.xlsx')

Using pandas for Data Handling

Installing the pandas Library

The pandas library provides powerful tools for working with large amounts of data and supports many file formats:

pip install pandas openpyxl

Saving Data to CSV with pandas

import pandas as pd

data = {
    'Имя': ['Иван', 'Мария'],
    'Возраст': [28, 25],
    'Город': ['Москва', 'Санкт-Петербург']
}

df = pd.DataFrame(data)
df.to_csv('people_pandas.csv', index=False, sep=';', encoding='utf-8')

Creating Excel Files with pandas

df.to_excel('people_pandas.xlsx', index=False, engine='openpyxl')

Working with Multiple Excel Sheets

Pandas allows you to create Excel files with multiple sheets:

with pd.ExcelWriter('multi_sheet.xlsx', engine='openpyxl') as writer:
    df.to_excel(writer, sheet_name='Сотрудники', index=False)
    
    # Creating a second sheet with other data
    sales_data = pd.DataFrame({
        'Месяц': ['Январь', 'Февраль', 'Март'],
        'Продажи': [10000, 15000, 12000]
    })
    sales_data.to_excel(writer, sheet_name='Продажи', index=False)

Additional Features and Settings

Working with Encodings

When working with different systems, it is important to correctly specify the file encoding:

# For CSV files
df.to_csv('data_utf8.csv', encoding='utf-8', index=False)
df.to_csv('data_cp1251.csv', encoding='cp1251', index=False)

# For Excel, encoding is usually not required

Data Filtering When Saving

# Saving only certain columns
df[['Имя', 'Город']].to_csv('filtered_columns.csv', index=False)

# Saving filtered rows
filtered_df = df[df['Возраст'] > 26]
filtered_df.to_csv('filtered_rows.csv', index=False)

Configuring Recording Parameters

# Various parameters for CSV
df.to_csv('custom.csv', 
          sep=';',           # separator
          decimal=',',       # decimal separator
          index=False,       # do not save indexes
          header=True,       # save headers
          quoting=csv.QUOTE_ALL)  # enclose all values in quotes

Solving Typical Problems

Problems with Separators

When working with CSV files in different regions, problems may arise with separators:

# Using a semicolon for European systems
df.to_csv('european.csv', sep=';', decimal=',')

# Automatically detecting the separator when reading
df = pd.read_csv('unknown.csv', sep=None, engine='python')

Working with Large Files

For processing large amounts of data, use reading and writing in parts:

# Writing in parts
for chunk in pd.read_csv('large_file.csv', chunksize=10000):
    processed_chunk = chunk.copy()  # data processing
    processed_chunk.to_csv('output.csv', mode='a', header=False, index=False)

Saving Multi-Level Indexes

# Creating a DataFrame with a multi-level index
arrays = [['A', 'A', 'B', 'B'], ['one', 'two', 'one', 'two']]
tuples = list(zip(*arrays))
index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])

df_multi = pd.DataFrame({'values': [1, 2, 3, 4]}, index=index)

# Saving while preserving the index structure
df_multi.to_csv('multi_index.csv', index=True)
df_multi.to_excel('multi_index.xlsx', index=True)

Choosing the Optimal Approach

The choice of data saving method depends on the specific requirements of the project:

Use CSV when:

  • Simple data transfer between systems is required
  • Minimum file size is required
  • Data has a simple tabular structure
  • Compatibility with various programs is required

Choose Excel for:

  • Creating complex reports with formatting
  • Working with formulas and calculations
  • The need for multiple sheets in one file
  • Creating documents for data presentation

Apply pandas when:

  • Processing large arrays of data
  • The need for flexible data operations
  • Working with complex data structures
  • Integration with data analysis systems

News