Working with text and binary files in Python: methods for processing various data formats

Working with files in Python: text and binary files

Working with files is one of the most common programming tasks. Python provides convenient tools for both text and binary files. Understanding the differences between these two file types will help you process data in various formats efficiently.

Text files in Python

Text files contain data in the form of characters that can be read and interpreted by humans. Examples include files with the extensions .txt, .csv, .json, and .html, as well as other files that store textual information.

Text file access modes

Python provides the following modes for working with text files:

  • 'r' - reading (the file must exist)
  • 'w' - writing (creates a new file or overwrites an existing one)
  • 'a' - appending to the end of the file (creates a new file or appends to an existing one)
  • 'r+' - reading and writing (the file must exist)
  • 'w+' - writing and reading (creates a new file or overwrites an existing one)
  • 'a+' - appending and reading (creates a new file or appends to an existing one)
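
The difference between 'w', 'a' and 'r+' is easiest to see on a scratch file. The following is a minimal sketch; the temporary directory and the file name modes_demo.txt are assumptions made only for this illustration.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

# 'w' creates the file (or empties an existing one) before writing
with open(path, "w", encoding="utf-8") as file:
    file.write("first\n")

# 'a' keeps the existing content and writes at the end
with open(path, "a", encoding="utf-8") as file:
    file.write("second\n")

# 'r+' opens an existing file for reading and writing without emptying it
with open(path, "r+", encoding="utf-8") as file:
    after_append = file.read()

# Opening with 'w' again discards everything written so far
with open(path, "w", encoding="utf-8") as file:
    file.write("third\n")

with open(path, "r", encoding="utf-8") as file:
    after_overwrite = file.read()

print(repr(after_append))
print(repr(after_overwrite))
```

The second read shows why 'w' should be used with care on files that already hold data.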

Examples of working with text files

# Reading from a text file
with open("text_file.txt", "r", encoding="utf-8") as file:
    data = file.read()
    print(data)

# Writing to a text file
with open("text_file.txt", "w", encoding="utf-8") as file:
    file.write("Hello, world!\n")
    file.write("This is a text file in Russian.")

# Adding to a text file
with open("text_file.txt", "a", encoding="utf-8") as file:
    file.write("\nAdd a new line to the file.")

Binary files in Python

Binary files contain data as a sequence of bytes that is not meant to be read directly by humans. Examples include images (.jpg, .png), audio files (.mp3, .wav), video files (.mp4, .avi), archives (.zip, .rar), and executable files.

Binary file access modes

The following modes are used to work with binary files:

  • 'rb' - reading a binary file (the file must exist)
  • 'wb' - writing to a binary file (creates a new file or overwrites an existing one)
  • 'ab' - appending to the end of a binary file (creates a new file or appends to an existing one)
  • 'rb+' - reading and writing a binary file (the file must exist)
  • 'wb+' - writing and reading a binary file (creates a new file or overwrites an existing one)
  • 'ab+' - appending to and reading a binary file (creates a new file or appends to an existing one)
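
The practical effect of the "b" flag is the type of data you get back: text modes return str objects decoded with the given encoding, while binary modes return raw bytes. Here is a small sketch of that difference; the scratch file name mode_compare.txt is an assumption for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "mode_compare.txt")

with open(path, "w", encoding="utf-8") as file:
    file.write("Привет")

# Text mode: bytes on disk are decoded into a str
with open(path, "r", encoding="utf-8") as file:
    text = file.read()

# Binary mode: the same file is returned as undecoded bytes
with open(path, "rb") as file:
    raw = file.read()

print(type(text).__name__, len(text))  # number of characters
print(type(raw).__name__, len(raw))    # number of bytes
```

Each Cyrillic letter takes two bytes in UTF-8, so the byte count is twice the character count here.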

Examples of working with binary files

# Reading from a binary file
with open("binary_file.bin", "rb") as file:
    data = file.read()
    print(data)

# Writing to a binary file
with open("binary_file.bin", "wb") as file:
    file.write(b"\x48\x65\x6C\x6C\x6F\x2C\x20\x77\x6F\x72\x6C\x64\x21")  # "Hello, world!" in bytes

# Adding to a binary file
with open("binary_file.bin", "ab") as file:
    file.write(b"\x0A\x4E\x65\x77\x20\x64\x61\x74\x61")  # "\nNew data" in bytes

The with context manager in Python

Using the with statement is the recommended practice when working with files in Python. It ensures that the file is closed automatically when the block exits, even if an exception is raised inside it.

# The correct way to work with files
with open("example.txt", "r") as file:
    content = file.read()
# The file will automatically close after exiting the with block
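
To see what the with statement saves you from writing, it can be compared with a manual try/finally. This is only a sketch of the idea; the scratch file name with_demo.txt and the simulated ValueError are assumptions for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")
with open(path, "w", encoding="utf-8") as file:
    file.write("data")

# Roughly what the with statement does for you
file = open(path, "r", encoding="utf-8")
try:
    content = file.read()
finally:
    file.close()
closed_manually = file.closed

# The with statement closes the file even when the block raises
try:
    with open(path, "r", encoding="utf-8") as managed:
        raise ValueError("simulated failure")
except ValueError:
    pass
closed_after_error = managed.closed

print(closed_manually, closed_after_error)
```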

Reading files line by line

For efficient processing of large text files, it is recommended to read them line by line:

with open("example.txt", "r", encoding="utf-8") as file:
    for line in file:
        print(line.strip())  # strip() removes leading and trailing whitespace, including the newline
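
Line-by-line iteration combines well with enumerate() when you need line numbers, for example to find matching lines. The sketch below assumes a small made-up log file (lines_demo.log) created just for the demonstration.

```python
import os
import tempfile

# Hypothetical log file created only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "lines_demo.log")
with open(path, "w", encoding="utf-8") as file:
    file.write("INFO start\nERROR disk full\nINFO done\n")

error_lines = []
with open(path, "r", encoding="utf-8") as file:
    # Lines are read one at a time as the loop advances
    for number, line in enumerate(file, start=1):
        if "ERROR" in line:
            # rstrip("\n") drops only the trailing newline
            error_lines.append((number, line.rstrip("\n")))

print(error_lines)
```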

Writing a list of lines to a file

Python lets you write an entire list of strings in one call with writelines(). The method does not add line breaks, so each string must already end with "\n":

lines = ["First line\n", "Second line\n", "Third line\n"]
with open("example.txt", "w", encoding="utf-8") as file:
    file.writelines(lines)
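
Because writelines() writes the strings exactly as given, forgetting the newline characters is a common slip. The following sketch contrasts the two cases; the scratch file name and the items list are assumptions for the example.

```python
import os
import tempfile

# Hypothetical scratch file and data used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "writelines_demo.txt")
items = ["alpha", "beta", "gamma"]

# Without explicit newlines the items run together
with open(path, "w", encoding="utf-8") as file:
    file.writelines(items)
with open(path, "r", encoding="utf-8") as file:
    joined = file.read()

# A generator expression adds the line breaks on the fly
with open(path, "w", encoding="utf-8") as file:
    file.writelines(f"{item}\n" for item in items)
with open(path, "r", encoding="utf-8") as file:
    separated = file.read()

print(repr(joined))
print(repr(separated))
```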

Reading a file into a list of lines

To load the entire contents of a file into a list, use the readlines() method:

with open("example.txt", "r", encoding="utf-8") as file:
    lines = file.readlines()
    print(lines)
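
Each element returned by readlines() keeps its trailing newline. If that is not wanted, one possible alternative is read().splitlines(), shown in this sketch; the scratch file name is an assumption for the example.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "readlines_demo.txt")
with open(path, "w", encoding="utf-8") as file:
    file.write("one\ntwo\n")

# readlines() keeps the newline at the end of each element
with open(path, "r", encoding="utf-8") as file:
    with_newlines = file.readlines()

# read().splitlines() returns the same lines without the newlines
with open(path, "r", encoding="utf-8") as file:
    without_newlines = file.read().splitlines()

print(with_newlines)
print(without_newlines)
```

Both approaches load the whole file into memory, so they suit small and medium files.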

Working with large files

When working with large files, it is important to save memory by reading the file in parts:

def read_large_file(filename, chunk_size=1024):
    """Read a large file in parts, yielding chunk_size characters at a time."""
    with open(filename, "r", encoding="utf-8") as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:
                break
            yield chunk

# Using the function
for chunk in read_large_file("large_file.txt"):
    process_chunk(chunk)  # process_chunk is a placeholder for your own processing code
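
The same chunking idea works for binary files, where the size passed to read() counts bytes rather than characters. Below is a self-contained sketch; the helper name read_in_chunks, the scratch file, and the 2500-byte payload are assumptions for the example.

```python
import os
import tempfile


def read_in_chunks(filename, chunk_size=1024):
    """Yield successive chunks of bytes from a file (hypothetical helper)."""
    with open(filename, "rb") as file:
        while True:
            chunk = file.read(chunk_size)
            if not chunk:  # an empty bytes object signals end of file
                break
            yield chunk


# Hypothetical scratch file with 2500 bytes of test data
path = os.path.join(tempfile.mkdtemp(), "large_file.bin")
with open(path, "wb") as file:
    file.write(b"x" * 2500)

chunk_sizes = [len(chunk) for chunk in read_in_chunks(path)]
print(chunk_sizes)
```

Only one chunk is held in memory at a time, and the last chunk is simply shorter than the others.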

Error handling when working with files

It is important to properly handle possible errors when working with files:

try:
    with open("nonexistent_file.txt", "r") as file:
        content = file.read()
except FileNotFoundError:
    print("File not found")
except PermissionError:
    print("Insufficient permissions to access the file")
except Exception as e:
    print(f"Error occurred: {e}")
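
In practice this pattern is often wrapped in a small function. The sketch below is one possible way to do it; read_text_or_default is a hypothetical helper, and it relies on the fact that FileNotFoundError and PermissionError are both subclasses of OSError.

```python
import os
import tempfile


def read_text_or_default(path, default=""):
    """Return the file's text, or default if the file cannot be opened.

    Hypothetical helper written for this illustration.
    """
    try:
        with open(path, "r", encoding="utf-8") as file:
            return file.read()
    except OSError as error:
        # Covers FileNotFoundError, PermissionError and other I/O errors
        print(f"Could not read {path}: {error}")
        return default


# A path that is known not to exist
missing_path = os.path.join(tempfile.mkdtemp(), "nonexistent_file.txt")
result = read_text_or_default(missing_path, default="fallback")
print(result)
```

Catching OSError instead of Exception keeps programming mistakes such as a misspelled variable name from being silently swallowed.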

Checking the existence of a file

Before working with a file, it is useful to check that it exists:

import os

if os.path.exists("example.txt"):
    with open("example.txt", "r") as file:
        content = file.read()
else:
    print("The file does not exist")
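
Note that os.path.exists() also returns True for directories. If you need to be sure the path is a regular file, the pathlib module offers is_file(). The sketch below shows the difference; the temporary folder and file names are assumptions for the example.

```python
import tempfile
from pathlib import Path

# Hypothetical scratch folder used only for this demonstration
folder = Path(tempfile.mkdtemp())
existing = folder / "example.txt"
existing.write_text("hello", encoding="utf-8")
missing = folder / "missing.txt"

checks = (
    existing.is_file(),  # a regular file that exists
    missing.exists(),    # nothing at this path
    folder.is_file(),    # exists, but is a directory
)
print(checks)

if existing.is_file():
    content = existing.read_text(encoding="utf-8")
    print(content)
```

A check like this can still be outdated by the time the file is opened, so it complements exception handling rather than replacing it.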

File encoding

When working with text files, it is important to specify the correct encoding, especially for files that contain Cyrillic text:

# Explicit UTF-8 encoding
with open("russian_text.txt", "w", encoding="utf-8") as file:
    file.write("Text in Russian")

# Reading with the encoding specified explicitly
with open("russian_text.txt", "r", encoding="utf-8") as file:
    content = file.read()
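
What happens when the encoding does not match can be shown with a short experiment. This is a sketch under assumed conditions: a scratch file written as UTF-8 and then deliberately read back as ASCII.

```python
import os
import tempfile

# Hypothetical scratch file used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "russian_text.txt")

with open(path, "w", encoding="utf-8") as file:
    file.write("Текст")

# Reading with an encoding that cannot decode the bytes raises an error
try:
    with open(path, "r", encoding="ascii") as file:
        file.read()
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True

# errors="replace" substitutes U+FFFD for bytes that cannot be decoded
with open(path, "r", encoding="ascii", errors="replace") as file:
    replaced = file.read()

# The matching encoding returns the original text
with open(path, "r", encoding="utf-8") as file:
    correct = file.read()

print(decode_failed, len(correct))
```

The errors="replace" option avoids the exception but loses the original characters, so it is better suited to diagnostics than to real data processing.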

Conclusion

Working with files in Python requires understanding the differences between text and binary files, choosing the right access mode, and consistently using the with context manager. This knowledge will let you process various types of files efficiently and avoid common mistakes when working with data.
