How to parse JSON in Python

What is JSON and Why Parse It?

JSON (JavaScript Object Notation) is a lightweight, text-based data interchange format that has become a de facto standard for exchanging data between applications. Despite its name, JSON is language-independent: parsers and serializers exist for virtually every modern programming language.

Key advantages of JSON:

  • Human-readable format
  • Simple data structure
  • Universal support
  • Compactness compared to XML

Example of a JSON object:

{
  "name": "Alice",
  "age": 30,
  "is_active": true,
  "hobbies": ["reading", "cycling", "traveling"]
}

Built-in json Module in Python

Python provides a powerful built-in json module for working with JSON data. This module allows you to perform two main operations:

  • Deserialization — converting a JSON string to Python objects
  • Serialization — converting Python objects to a JSON string

To get started, import the module:

import json

Parsing JSON Strings into Python Objects

Using json.loads()

The json.loads() (load string) function converts a JSON string into a corresponding Python object:

import json

json_data = '{"name": "Bob", "age": 25, "is_active": false}'
parsed_data = json.loads(json_data)

print(parsed_data)  # {'name': 'Bob', 'age': 25, 'is_active': False}
print(type(parsed_data))  # <class 'dict'>

After parsing, you can work with the data as with a regular dictionary:

print(parsed_data['name'])  # Bob
print(parsed_data['age'])   # 25

Data Type Correspondence

JSON types are automatically converted to Python types:

JSON          Python
object        dict
array         list
string        str
number        int/float
true/false    True/False
null          None
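The mapping above can be checked directly. The sketch below parses a small document (the sample values are invented for illustration) and prints the resulting Python types. It also shows two round-trip details that follow from the table: tuples come back as lists, and non-string dictionary keys come back as strings.

```python
import json

# Sample document invented for illustration; it covers every JSON type.
doc = (
    '{"obj": {"k": 1}, "arr": [1, 2], "text": "hi", '
    '"whole": 7, "real": 7.5, "flag": true, "nothing": null}'
)
parsed = json.loads(doc)

for key, value in parsed.items():
    print(f"{key:8} -> {type(value).__name__}")

# JSON has no tuple type and only string keys, so a round trip changes both.
round_trip = json.loads(json.dumps({1: ("a", "b")}))
print(round_trip)  # {'1': ['a', 'b']}
```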

Converting Python Objects to JSON

Using json.dumps()

The json.dumps() function converts Python objects into a JSON string:

data = {'name': 'Alice', 'age': 30, 'is_active': True}
json_string = json.dumps(data)
print(json_string)  # {"name": "Alice", "age": 30, "is_active": true}
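json.dumps() only knows the types listed in the correspondence table, so values such as datetime or set raise TypeError. One common way around this, sketched here, is the default parameter, which is called for every object the encoder cannot handle. The helper name to_jsonable and the sample data are illustrative assumptions.

```python
import json
from datetime import datetime

def to_jsonable(obj):
    """Fallback for types json.dumps cannot encode (hypothetical helper)."""
    if isinstance(obj, datetime):
        return obj.isoformat()
    if isinstance(obj, set):
        return sorted(obj)
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

event = {"name": "deploy", "at": datetime(2024, 1, 15, 9, 30), "tags": {"prod", "api"}}

try:
    json.dumps(event)
except TypeError as exc:
    print(f"Without default: {exc}")

encoded = json.dumps(event, default=to_jsonable)
print(encoded)  # {"name": "deploy", "at": "2024-01-15T09:30:00", "tags": ["api", "prod"]}
```

The conversion is one-way: json.loads() returns the timestamp as a plain string, so the reading side has to convert it back if it needs a datetime.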

Formatting Parameters

For improved readability, use additional parameters:

data = {'name': 'Alice', 'age': 30, 'city': 'Moscow'}
json_pretty = json.dumps(data, indent=4, sort_keys=True, ensure_ascii=False)
print(json_pretty)

Key parameters:

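  • indent: number of spaces per nesting level; enables pretty-printed, multi-line output
  • sort_keys: when True, dictionary keys are written in sorted order
  • ensure_ascii: when False, non-ASCII characters are written as-is instead of being escaped as \uXXXX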
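To see what each parameter changes, the following sketch serializes the same dictionary several ways. The sample data is an assumption chosen to include non-ASCII text; separators is an additional json.dumps() parameter not covered in the list above.

```python
import json

data = {"name": "Alice", "city": "Москва"}

# Default: non-ASCII characters are escaped as \uXXXX sequences.
compact = json.dumps(data)
print(compact)

# ensure_ascii=False keeps the original characters.
readable = json.dumps(data, ensure_ascii=False)
print(readable)  # {"name": "Alice", "city": "Москва"}

# indent and sort_keys produce stable, human-friendly output.
pretty = json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False)
print(pretty)

# separators drops the spaces after ',' and ':' for the smallest output.
smallest = json.dumps(data, separators=(",", ":"), ensure_ascii=False)
print(smallest)  # {"name":"Alice","city":"Москва"}
```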

Working with JSON Files

Reading JSON from a File

Use json.load() to read JSON data from a file:

with open('data.json', 'r', encoding='utf-8') as file:
    content = json.load(file)
    print(content)

Writing JSON to a File

The json.dump() function writes data to a file:

data = {'city': 'Moscow', 'temperature': -5, 'weather': 'снег'}

with open('weather.json', 'w', encoding='utf-8') as file:
    json.dump(data, file, indent=4, ensure_ascii=False)

Recommendations for Working with Files

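  • Always specify encoding='utf-8' when opening the file.
  • Open files with the with statement so they are closed automatically.
  • Pass ensure_ascii=False so that Cyrillic and other non-ASCII text is written as-is rather than as \uXXXX escapes.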
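The recommendations above can be combined into a write-then-read round trip. This is a minimal sketch: it uses a temporary directory so that it leaves no files behind, and the file name and data are illustrative.

```python
import json
import tempfile
from pathlib import Path

settings = {"city": "Москва", "temperature": -5}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "weather.json"

    # Write with explicit UTF-8 and unescaped non-ASCII characters.
    with open(path, "w", encoding="utf-8") as file:
        json.dump(settings, file, indent=4, ensure_ascii=False)

    # The raw file content stays human-readable.
    raw_text = path.read_text(encoding="utf-8")

    # Read it back with the same encoding.
    with open(path, "r", encoding="utf-8") as file:
        restored = json.load(file)

print(raw_text)
print(restored == settings)  # True
```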

Handling Nested JSON Structures

Accessing Nested Data

Example of a complex JSON structure:

{
  "user": {
    "name": "Charlie",
    "contacts": {
      "email": "charlie@example.com",
      "phone": "123456789"
    },
    "preferences": {
      "theme": "dark",
      "notifications": true
    }
  }
}

Parsing and accessing data:

import json

json_data = '''
{
  "user": {
    "name": "Charlie",
    "contacts": {
      "email": "charlie@example.com",
      "phone": "123456789"
    }
  }
}
'''

parsed = json.loads(json_data)
email = parsed['user']['contacts']['email']
print(email)  # charlie@example.com

Safe Data Access

Chain get() calls with empty-dict defaults to avoid a KeyError when a key is missing:

email = parsed.get('user', {}).get('contacts', {}).get('email', 'Not specified')
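Chained get() calls work only while every intermediate value is a dictionary. If a key exists but holds null (None in Python) or a list, the next get() raises AttributeError. A small helper can guard against that; get_nested below is a hypothetical name, sketched under the assumption that all lookups are by dictionary key.

```python
import json

def get_nested(data, *keys, default=None):
    """Walk nested dicts by key; return default if a step is missing or not a dict."""
    current = data
    for key in keys:
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current

parsed = json.loads('{"user": {"name": "Charlie", "contacts": null}}')

print(get_nested(parsed, "user", "name"))  # Charlie
print(get_nested(parsed, "user", "contacts", "email", default="Not specified"))  # Not specified
```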

Handling Errors When Working with JSON

Common Error Types

The most common error is json.JSONDecodeError, which is raised when the input is not valid JSON. The exception carries the position of the problem:

import json

try:
    # Invalid JSON
    invalid_json = '{"name": "Alice", "age": 30,'
    data = json.loads(invalid_json)
except json.JSONDecodeError as e:
    print(f"JSON decoding error: {e}")
    print(f"Error position: {e.pos}")
    print(f"Line: {e.lineno}, Column: {e.colno}")

Validating JSON Data

Create a function to check the validity of JSON:

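def is_valid_json(json_string):
    """Return True if json_string parses as JSON, False otherwise."""
    try:
        json.loads(json_string)
        return True
    except (json.JSONDecodeError, TypeError):
        # TypeError covers input that is not str, bytes, or bytearray
        return False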

# Example usage
test_json = '{"name": "Test"}'
if is_valid_json(test_json):
    print("JSON is valid")
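A syntax check does not tell you whether the document has the shape your code expects. The sketch below adds a simple structural check on top of parsing. The field list and the function name validate_user are assumptions made for illustration; real projects often use a dedicated schema library instead.

```python
import json

# Hypothetical schema: field name -> expected Python type after parsing.
REQUIRED_FIELDS = {"name": str, "age": int, "is_active": bool}

def validate_user(json_string):
    """Return (data, errors); errors is empty when the document matches the schema."""
    try:
        data = json.loads(json_string)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg} (line {exc.lineno}, column {exc.colno})"]

    if not isinstance(data, dict):
        return None, ["top-level value must be an object"]

    errors = []
    for field, expected_type in REQUIRED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif type(data[field]) is not expected_type:
            # Exact type comparison, so True is not accepted where an int is expected.
            errors.append(f"wrong type for {field}: expected {expected_type.__name__}")
    return data, errors

_, ok_errors = validate_user('{"name": "Bob", "age": 25, "is_active": false}')
_, bad_errors = validate_user('{"name": "Bob", "age": "25"}')
print(ok_errors)   # []
print(bad_errors)  # ['wrong type for age: expected int', 'missing field: is_active']
```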

Working with Large JSON Files

Memory Issues

json.load() and json.loads() build the entire document in memory, so processing very large JSON files (several GB) with them can exhaust the available RAM.
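If you control the file format, one way to avoid loading everything at once with only the standard library is to store one JSON document per line (the JSON Lines convention) and parse each line separately. This is a sketch under that assumption, not a replacement for a streaming parser; the function name iter_json_lines and the sample records are made up.

```python
import io
import json

def iter_json_lines(file):
    """Yield one parsed object per non-empty line of a JSON Lines stream."""
    for line_number, line in enumerate(file, start=1):
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError as exc:
            raise ValueError(f"line {line_number}: {exc}") from exc

# io.StringIO stands in for a large file opened with open(path, encoding="utf-8").
sample = io.StringIO(
    '{"id": 1, "name": "a"}\n{"id": 2, "name": "b"}\n\n{"id": 3, "name": "c"}\n'
)
total = sum(record["id"] for record in iter_json_lines(sample))
print(total)  # 6
```

Only one record is held in memory at a time, so memory use depends on the largest line rather than on the file size.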

Using ijson

For streaming processing of large files, use the ijson library:

pip install ijson

import ijson

def parse_large_json(filename):
    with open(filename, 'rb') as file:
        parser = ijson.parse(file)
        for prefix, event, value in parser:
            if prefix == 'items.item.name':
                print(f"Found name: {value}")

Alternative Libraries for Working with JSON

ujson — High Performance

ujson (UltraJSON) is a C-based library with a similar loads()/dumps() interface that is typically faster than the built-in module:

pip install ujson

import ujson

data = {'name': 'FastParser', 'version': 1}
json_string = ujson.dumps(data)
parsed_data = ujson.loads(json_string)

orjson — Fastest Library

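orjson is widely regarded as one of the fastest JSON libraries for Python. Note that orjson.dumps() returns bytes rather than str:

pip install orjson

import orjson

data = {'name': 'SuperFast', 'version': 2}
json_bytes = orjson.dumps(data)         # bytes, not str
parsed_data = orjson.loads(json_bytes)  # accepts bytes or str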

Performance Comparison

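The figures below are rough estimates; actual speedups depend on the data, library versions, and hardware.

Library    Parsing speed    Serialization speed    Features
json       baseline         baseline               Built-in, full support
ujson      2-3x faster      1.5-2x faster          Good compatibility
orjson     3-5x faster      2-4x faster            Returns bytes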
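Because such ratios vary with the data, it is worth measuring on a payload that resembles your own. The following sketch times loads() and dumps() with timeit for the built-in module and for ujson and orjson when they are installed. The synthetic payload and the iteration count are arbitrary assumptions.

```python
import importlib
import json
import timeit

# Synthetic payload invented for this sketch; replace it with typical data.
payload = {
    "items": [
        {"id": i, "name": f"item-{i}", "tags": ["a", "b"], "price": i * 1.5}
        for i in range(1000)
    ]
}
text = json.dumps(payload)

libraries = {"json": json}
for optional_name in ("ujson", "orjson"):
    try:
        libraries[optional_name] = importlib.import_module(optional_name)
    except ImportError:
        pass  # Not installed; leave it out of the comparison.

results = {}
for name, module in libraries.items():
    parse_time = timeit.timeit(lambda: module.loads(text), number=50)
    dump_time = timeit.timeit(lambda: module.dumps(payload), number=50)
    results[name] = (parse_time, dump_time)
    print(f"{name:7} loads: {parse_time:.4f}s  dumps: {dump_time:.4f}s")
```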

Practical Examples of Use

Working with an API

import requests
import json

def get_user_data(username):
    try:
        response = requests.get(f'https://api.github.com/users/{username}', timeout=10)
        response.raise_for_status()

        user_data = response.json()
        return {
            # The value may be null (None) for an unset name, so fall back explicitly
            'name': user_data.get('name') or 'Not specified',
            'public_repos': user_data.get('public_repos', 0),
            'followers': user_data.get('followers', 0)
        }
    except requests.RequestException as e:
        print(f"Request error: {e}")
        return None

# Usage
user_info = get_user_data('octocat')
if user_info:
    print(json.dumps(user_info, indent=2, ensure_ascii=False))

Configuration Files

import json
import os

class ConfigManager:
    def __init__(self, config_file='config.json'):
        self.config_file = config_file
        self.config = self.load_config()
    
    def load_config(self):
        if os.path.exists(self.config_file):
            with open(self.config_file, 'r', encoding='utf-8') as file:
                return json.load(file)
        return {}
    
    def save_config(self):
        with open(self.config_file, 'w', encoding='utf-8') as file:
            json.dump(self.config, file, indent=4, ensure_ascii=False)
    
    def get(self, key, default=None):
        return self.config.get(key, default)
    
    def set(self, key, value):
        self.config[key] = value
        self.save_config()

# Usage
config = ConfigManager()
config.set('theme', 'dark')
config.set('language', 'ru')

Best Practices and Recommendations

Security

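  • Never use eval() to parse JSON: it executes arbitrary code. Use json.loads() instead.
  • Validate data after parsing: check that required keys exist and values have the expected types.
  • Limit the size of input accepted from untrusted sources.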
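These points can be combined into a single entry point for untrusted input. The sketch below is one possible shape; the name safe_loads, the size limit, and the rule that the top-level value must be an object are assumptions to adapt to your application.

```python
import json

MAX_JSON_BYTES = 1_000_000  # Hypothetical limit; pick one that fits your application.

def safe_loads(raw, max_bytes=MAX_JSON_BYTES):
    """Parse JSON from str or bytes, rejecting oversized or non-object input."""
    size = len(raw.encode("utf-8")) if isinstance(raw, str) else len(raw)
    if size > max_bytes:
        raise ValueError(f"JSON input too large: {size} bytes (limit {max_bytes})")
    data = json.loads(raw)  # Parses data only; unlike eval(), it never executes code.
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data

print(safe_loads('{"ok": true}'))  # {'ok': True}

try:
    safe_loads('{"blob": "' + "x" * 100 + '"}', max_bytes=50)
except ValueError as exc:
    print(exc)
```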

Performance

  • Consider ujson or orjson for performance-critical applications, and measure on your own data.
  • For large files, use streaming processing (for example, ijson).
  • Cache parsed results when the same data is read repeatedly.
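For the caching point, one simple option is functools.lru_cache keyed on the JSON string itself. This is a sketch, with parse_cached as a hypothetical name. It returns the same object on every hit, so callers should treat the result as read-only or copy it before changing it.

```python
import copy
import json
from functools import lru_cache

@lru_cache(maxsize=128)
def parse_cached(json_string):
    """Parse a JSON string, reusing the result for identical input strings."""
    return json.loads(json_string)

text = '{"theme": "dark", "language": "en"}'
first = parse_cached(text)
second = parse_cached(text)  # Served from the cache, no second parse.

print(first is second)                 # True
print(parse_cached.cache_info().hits)  # 1

# Copy before modifying so the cached object stays intact.
editable = copy.deepcopy(first)
editable["theme"] = "light"
print(parse_cached(text)["theme"])     # dark
```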

Encoding and Localization

  • Always specify encoding='utf-8' when working with files.
  • Use ensure_ascii=False to write non-ASCII characters as-is instead of \uXXXX escapes.
  • When working with APIs, check which encoding the response uses; JSON is normally sent as UTF-8.

Debugging and Testing

Useful Tools

To inspect JSON data while debugging, a small helper like this prints its type, size, and content:

import json
import pprint

def debug_json(data):
    print("=== JSON Debug ===")
    print(f"Type: {type(data)}")
    print(f"Size: {len(data) if hasattr(data, '__len__') else 'N/A'}")
    print("Content:")
    pprint.pprint(data)
    print("JSON string:")
    print(json.dumps(data, indent=2, ensure_ascii=False))

Testing JSON Operations

import unittest
import json

class TestJSONOperations(unittest.TestCase):
    def test_parse_valid_json(self):
        json_str = '{"name": "test", "value": 42}'
        result = json.loads(json_str)
        self.assertEqual(result['name'], 'test')
        self.assertEqual(result['value'], 42)
    
    def test_parse_invalid_json(self):
        invalid_json = '{"name": "test"'
        with self.assertRaises(json.JSONDecodeError):
            json.loads(invalid_json)

if __name__ == '__main__':
    unittest.main()

Working with JSON in Python is a fundamental skill for a modern developer. The built-in json module covers most needs, but for specific tasks it is worth considering alternative libraries. Proper use of these tools will allow you to effectively integrate Python applications with web services, APIs, and data exchange systems.
