PickleDB: A Lightweight Key-Value Database for Python


Introduction

PickleDB is a lightweight key‑value database written in pure Python that requires no additional servers or dependencies. This library provides a simple solution for storing small amounts of data in JSON format, making it an ideal choice for rapid prototypes, CLI utilities, configuration files, and micro‑projects.

The main advantage of PickleDB lies in its ease of use while offering enough functionality to solve a wide range of data‑storage tasks. The library supports various data types: strings, numbers, lists, dictionaries, and boolean values.

What Is PickleDB and Where Is It Used?

PickleDB is a NoSQL database that stores data as key‑value pairs in a JSON file. It was created as an alternative for cases where full‑featured DBMSs such as PostgreSQL or MySQL would be overkill.

Typical use cases for PickleDB include:

  • Storing user settings and configuration
  • Caching intermediate computation results
  • Saving authorization tokens and API keys
  • Maintaining simple logs and counters
  • Creating fast prototypes with data persistence
  • Developing CLI applications with local storage

Installation and Initial Setup

Install PickleDB with the standard Python package manager:

pip install pickledb

After installation the library is ready to use without any extra configuration:

import pickledb

# Create or load a database
db = pickledb.load('database.db', auto_dump=True)

The auto_dump=True option writes every change to the file immediately, which reduces the risk of data loss if the program terminates unexpectedly (though it cannot protect against a crash that happens mid-write).

Architecture and Operating Principles

Data Storage Structure

PickleDB uses JSON as its storage format, providing human‑readable data that can be transferred across systems. The database file is a regular JSON document with a structure similar to:

{
  "simple_key": "string value",
  "numeric_key": 42,
  "boolean_key": true,
  "list_key": ["item1", "item2", "item3"],
  "dict_key": {
    "nested_field": "nested value",
    "nested_number": 100
  }
}
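Because the file is plain JSON, it can be inspected or post-processed with the standard json module alone. A minimal sketch (the file name inspect_me.json and the data are made up for the example):

```python
import json

# Write a file in the same layout PickleDB produces (example data only)
record = {
    "simple_key": "string value",
    "numeric_key": 42,
    "boolean_key": True,
    "list_key": ["item1", "item2", "item3"],
}
with open("inspect_me.json", "w", encoding="utf-8") as f:
    json.dump(record, f, indent=2)

# Read it back with plain json -- no PickleDB required
with open("inspect_me.json", encoding="utf-8") as f:
    data = json.load(f)

print(data["numeric_key"])  # 42
```

This round-trip is exactly why a PickleDB file can be diffed, versioned, or handed to any other JSON-aware tool.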

Saving Mechanism

When working with PickleDB, all data is loaded into memory at initialization. Modifications are applied to the in‑memory object and then, depending on the configuration, persisted to disk:

  • With auto_dump=True – every write operation is immediately persisted to disk
  • With auto_dump=False – changes stay in memory until you call dump() manually

Complete Reference of PickleDB Methods and Functions

Core Data Manipulation Methods

  • set(key, value) – saves a value under the given key: db.set('name', 'John')
  • get(key) – retrieves the value for a key: name = db.get('name')
  • exists(key) – checks whether a key exists: if db.exists('name'):
  • rem(key) – removes a key and its value: db.rem('name')
  • getall() – returns a list of all keys: keys = db.getall()
  • totalkeys() – returns the total number of keys: count = db.totalkeys()
  • append(key, value) – appends a string to an existing value: db.append('text', ' addition')

Numeric Methods

  • incr(key, by=1) – increments a numeric value: db.incr('counter')
  • decr(key, by=1) – decrements a numeric value: db.decr('counter', 5)

List Methods

  • lcreate(name) – creates a new list: db.lcreate('items')
  • ladd(name, value) – adds an element to a list: db.ladd('items', 'new_item')
  • lget(name, pos) – retrieves an element by position: item = db.lget('items', 0)
  • lgetall(name) – returns all elements of a list: items = db.lgetall('items')
  • lremlist(name) – deletes the entire list: db.lremlist('items')
  • lremvalue(name, value) – removes the first occurrence of a value: db.lremvalue('items', 'element')
  • lpop(name, pos) – removes and returns an element by position: item = db.lpop('items', -1)
  • llen(name) – returns the length of a list: length = db.llen('items')
  • lextend(name, seq) – extends a list with a sequence: db.lextend('items', [1, 2, 3])
  • lexists(name, value) – checks whether a value exists in a list: exists = db.lexists('items', 'element')

Dictionary Methods

  • dcreate(name) – creates a new dictionary: db.dcreate('user_data')
  • dadd(name, pair) – adds a key-value pair to a dictionary: db.dadd('user_data', ('age', 25))
  • dget(name, key) – retrieves a value by key from a dictionary: age = db.dget('user_data', 'age')
  • dgetall(name) – returns the entire dictionary: user = db.dgetall('user_data')
  • drem(name) – deletes the whole dictionary: db.drem('user_data')
  • dpop(name, key) – removes and returns a value by key: value = db.dpop('user_data', 'age')
  • dkeys(name) – returns all keys of a dictionary: keys = db.dkeys('user_data')
  • dvals(name) – returns all values of a dictionary: values = db.dvals('user_data')
  • dexists(name, key) – checks whether a key exists in a dictionary: exists = db.dexists('user_data', 'age')
  • dmerge(name1, name2) – merges the second dictionary into the first: db.dmerge('dict1', 'dict2')

Utility Methods

  • dump() – forcefully writes data to the file: db.dump()
  • load(location, auto_dump) – loads a database from a file: db = pickledb.load('db.json', True)
  • deldb() – deletes all data from the database: db.deldb()

Working with Different Data Types

Simple Data Types

PickleDB supports all core Python data types:

import pickledb

db = pickledb.load('example.db', auto_dump=True)

# Strings
db.set('username', 'admin')
db.set('email', 'user@example.com')

# Numbers
db.set('user_id', 12345)
db.set('balance', 1000.50)

# Booleans
db.set('is_active', True)
db.set('email_verified', False)

# Retrieval
username = db.get('username')          # 'admin'
balance = db.get('balance')            # 1000.5
is_active = db.get('is_active')       # True

List Operations

PickleDB offers a rich API for list handling:

# Create and populate a list
db.lcreate('shopping_list')
db.ladd('shopping_list', 'milk')
db.ladd('shopping_list', 'bread')
db.ladd('shopping_list', 'eggs')

# Retrieve items
all_items = db.lgetall('shopping_list')   # ['milk', 'bread', 'eggs']
first_item = db.lget('shopping_list', 0)  # 'milk'
last_item = db.lget('shopping_list', -1)  # 'eggs'

# Check existence
has_milk = db.lexists('shopping_list', 'milk')  # True

# Remove items
db.lremvalue('shopping_list', 'bread')          # removes 'bread'
removed_item = db.lpop('shopping_list', 0)      # removes and returns first element

# Extend list
db.lextend('shopping_list', ['butter', 'cheese', 'yogurt'])

Dictionary Operations

For more complex structures PickleDB provides dictionary methods:

# Create a user profile
db.dcreate('user_profile')
db.dadd('user_profile', ('name', 'Anna Petrova'))
db.dadd('user_profile', ('age', 28))
db.dadd('user_profile', ('city', 'Moscow'))
db.dadd('user_profile', ('occupation', 'Developer'))

# Retrieve data
user_name = db.dget('user_profile', 'name')      # 'Anna Petrova'
full_profile = db.dgetall('user_profile')       # complete dict

# Check field existence
has_phone = db.dexists('user_profile', 'phone') # False

# Keys and values
profile_keys = db.dkeys('user_profile')         # ['name', 'age', 'city', 'occupation']
profile_values = db.dvals('user_profile')       # ['Anna Petrova', 28, 'Moscow', 'Developer']

# Delete a field
old_age = db.dpop('user_profile', 'age')        # removes and returns age

Data Saving Modes

Automatic Saving

With auto_dump=True every change is instantly persisted to disk:

db = pickledb.load('auto_save.db', auto_dump=True)
db.set('last_update', '2024-01-15')  # automatically saved

Manual Saving

Setting auto_dump=False lets you control when data is written:

db = pickledb.load('manual_save.db', auto_dump=False)

# Perform many operations
db.set('counter', 1)
db.incr('counter')
db.lcreate('items')
db.ladd('items', 'item1')

# Flush all changes with a single call
db.dump()

This approach is useful for bulk writes because it reduces disk I/O.

In‑Memory Only Mode

For transient data you can keep the database purely in memory:

db = pickledb.load('temp.db', auto_dump=False)
# Work with data but never call dump()
# All data is lost when the program exits

Integration with Web Frameworks

Using with Flask

PickleDB integrates smoothly with Flask for storing configurations and user data:

from flask import Flask, request, jsonify
import pickledb

app = Flask(__name__)
db = pickledb.load('flask_app.db', auto_dump=True)

@app.route('/user/<user_id>', methods=['GET'])
def get_user(user_id):
    if db.exists(f'user_{user_id}'):
        user_data = db.get(f'user_{user_id}')
        return jsonify(user_data)
    return jsonify({'error': 'User not found'}), 404

@app.route('/user/<user_id>', methods=['POST'])
def create_user(user_id):
    user_data = request.get_json()
    db.set(f'user_{user_id}', user_data)
    return jsonify({'status': 'User created'})

@app.route('/stats')
def get_stats():
    return jsonify({
        'total_users': db.totalkeys(),
        'all_keys': db.getall()
    })

if __name__ == '__main__':
    app.run(debug=True)

Using with FastAPI

FastAPI can be paired with PickleDB to build fast APIs:

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import pickledb

app = FastAPI()
db = pickledb.load('fastapi_app.db', auto_dump=True)

class UserModel(BaseModel):
    name: str
    email: str
    age: int

@app.get('/users/{user_id}')
async def get_user(user_id: str):
    if not db.exists(f'user_{user_id}'):
        raise HTTPException(status_code=404, detail="User not found")
    return db.get(f'user_{user_id}')

@app.post('/users/{user_id}')
async def create_user(user_id: str, user: UserModel):
    db.set(f'user_{user_id}', user.dict())  # Pydantic v1; use user.model_dump() in v2
    return {"message": "User created successfully"}

@app.delete('/users/{user_id}')
async def delete_user(user_id: str):
    if not db.exists(f'user_{user_id}'):
        raise HTTPException(status_code=404, detail="User not found")
    db.rem(f'user_{user_id}')
    return {"message": "User deleted successfully"}

Practical Usage Examples

Application Configuration System

import pickledb

class AppConfig:
    def __init__(self, config_file='app_config.db'):
        self.db = pickledb.load(config_file, auto_dump=True)
        self._set_defaults()
    
    def _set_defaults(self):
        defaults = {
            'theme': 'light',
            'language': 'en',
            'auto_save': True,
            'log_level': 'INFO',
            'max_connections': 100
        }
        for key, value in defaults.items():
            if not self.db.exists(key):
                self.db.set(key, value)
    
    def get(self, key):
        return self.db.get(key)
    
    def set(self, key, value):
        self.db.set(key, value)
    
    def reset_to_defaults(self):
        self.db.deldb()
        self._set_defaults()

# Usage
config = AppConfig()
print(f"Current theme: {config.get('theme')}")
config.set('theme', 'dark')

API Response Caching

import pickledb
import requests
import time
from datetime import datetime, timedelta

class APICache:
    def __init__(self, cache_file='api_cache.db', ttl_minutes=30):
        self.db = pickledb.load(cache_file, auto_dump=True)
        self.ttl_minutes = ttl_minutes
    
    def _is_expired(self, timestamp):
        cache_time = datetime.fromisoformat(timestamp)
        return datetime.now() - cache_time > timedelta(minutes=self.ttl_minutes)
    
    def get_cached_response(self, url):
        cache_key = f"cache_{url}"
        timestamp_key = f"time_{url}"
        if self.db.exists(cache_key) and self.db.exists(timestamp_key):
            if not self._is_expired(self.db.get(timestamp_key)):
                return self.db.get(cache_key)
        return None
    
    def cache_response(self, url, response_data):
        cache_key = f"cache_{url}"
        timestamp_key = f"time_{url}"
        self.db.set(cache_key, response_data)
        self.db.set(timestamp_key, datetime.now().isoformat())
    
    def clear_expired(self):
        for key in self.db.getall():
            if key.startswith('time_') and self._is_expired(self.db.get(key)):
                url = key.replace('time_', '')
                self.db.rem(f'cache_{url}')
                self.db.rem(key)

# Usage
cache = APICache()

def get_weather_data(city):
    cached = cache.get_cached_response(f'weather_{city}')
    if cached:
        return cached
    # Simulate API call
    api_data = {'city': city, 'temp': 20, 'humidity': 60}
    cache.cache_response(f'weather_{city}', api_data)
    return api_data

CLI Utility with History Persistence

import pickledb
import argparse
from datetime import datetime

class TaskManager:
    def __init__(self):
        self.db = pickledb.load('tasks.db', auto_dump=True)
        if not self.db.exists('tasks'):
            self.db.lcreate('tasks')
        if not self.db.exists('completed_tasks'):
            self.db.lcreate('completed_tasks')
    
    def add_task(self, description):
        task = {
            # Note: llen-based IDs repeat once tasks are completed and
            # removed; a real app should keep a persistent counter
            'id': self.db.llen('tasks') + 1,
            'description': description,
            'created_at': datetime.now().isoformat(),
            'completed': False
        }
        self.db.ladd('tasks', task)
        print(f"Task added: {description}")
    
    def list_tasks(self):
        tasks = self.db.lgetall('tasks')
        if not tasks:
            print("No active tasks")
            return
        print("Active tasks:")
        for task in tasks:
            print(f"  [{task['id']}] {task['description']}")
    
    def complete_task(self, task_id):
        tasks = self.db.lgetall('tasks')
        for i, task in enumerate(tasks):
            if task['id'] == task_id:
                task['completed'] = True
                task['completed_at'] = datetime.now().isoformat()
                self.db.ladd('completed_tasks', task)
                self.db.lpop('tasks', i)
                print(f"Task {task_id} completed!")
                return
        print(f"Task with ID {task_id} not found")
    
    def show_stats(self):
        active = self.db.llen('tasks')
        completed = self.db.llen('completed_tasks')
        print(f"Active tasks: {active}")
        print(f"Completed tasks: {completed}")

def main():
    parser = argparse.ArgumentParser(description='Task manager')
    parser.add_argument('command', choices=['add', 'list', 'complete', 'stats'])
    parser.add_argument('--description', help='Task description')
    parser.add_argument('--id', type=int, help='Task ID')
    
    args = parser.parse_args()
    manager = TaskManager()
    
    if args.command == 'add' and args.description:
        manager.add_task(args.description)
    elif args.command == 'list':
        manager.list_tasks()
    elif args.command == 'complete' and args.id:
        manager.complete_task(args.id)
    elif args.command == 'stats':
        manager.show_stats()

if __name__ == '__main__':
    main()

Performance Optimization

Batch Operations

For high‑throughput writes, disable automatic dumping and flush once after the batch:

db = pickledb.load('bulk_data.db', auto_dump=False)

# Perform many writes
for i in range(1000):
    db.set(f'item_{i}', {'value': i, 'processed': False})

# Single dump at the end
db.dump()

Managing Database Size

Regularly prune outdated records to keep the database fast:

def cleanup_old_data(db, days_old=30):
    from datetime import datetime, timedelta
    cutoff = datetime.now() - timedelta(days=days_old)
    for key in db.getall():
        if key.startswith('temp_'):
            data = db.get(key)
            if isinstance(data, dict) and 'created_at' in data:
                created = datetime.fromisoformat(data['created_at'])
                if created < cutoff:
                    db.rem(key)

Error Handling and Debugging

Data Integrity Checks

def validate_database(db):
    try:
        keys = db.getall()
        print(f"Total keys: {len(keys)}")
        for key in keys:
            value = db.get(key)
            if value is None:
                print(f"Warning: key '{key}' has None value")
        return True
    except Exception as e:
        print(f"Validation error: {e}")
        return False

Creating Backups

import shutil
from datetime import datetime

def backup_database(db_file):
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    backup_file = f"{db_file}.backup_{timestamp}"
    try:
        shutil.copy2(db_file, backup_file)
        print(f"Backup created: {backup_file}")
        return backup_file
    except Exception as e:
        print(f"Backup error: {e}")
        return None

Comparison with Alternatives

PickleDB vs TinyDB vs SQLite

  • Library size: PickleDB – very small; TinyDB – medium; SQLite – built into Python
  • Query syntax: PickleDB – simple API; TinyDB – query API; SQLite – SQL
  • Performance: PickleDB – high for small datasets; TinyDB – medium; SQLite – high
  • Transaction support: PickleDB – no; TinyDB – limited; SQLite – full
  • Indexing: PickleDB – no; TinyDB – yes; SQLite – full
  • Best suited for: PickleDB – configs and caches; TinyDB – mid-size apps; SQLite – relational data
  • Maximum practical size: PickleDB – up to ~100 MB; TinyDB – up to ~1 GB; SQLite – practically unlimited

Testing Applications with PickleDB

Using Temporary Databases

import unittest
import tempfile
import os
import pickledb

class TestPickleDBApp(unittest.TestCase):
    def setUp(self):
        # Create a temporary file for tests
        self.temp_file = tempfile.NamedTemporaryFile(delete=False)
        self.temp_file.close()
        self.db = pickledb.load(self.temp_file.name, auto_dump=True)
    
    def tearDown(self):
        # Remove the temporary file after tests
        os.unlink(self.temp_file.name)
    
    def test_basic_operations(self):
        self.db.set('test_key', 'test_value')
        self.assertEqual(self.db.get('test_key'), 'test_value')
        self.assertTrue(self.db.exists('test_key'))
        self.db.rem('test_key')
        self.assertFalse(self.db.exists('test_key'))
    
    def test_list_operations(self):
        self.db.lcreate('test_list')
        self.db.ladd('test_list', 'item1')
        self.db.ladd('test_list', 'item2')
        self.assertEqual(self.db.llen('test_list'), 2)
        self.assertEqual(self.db.lget('test_list', 0), 'item1')
        self.assertTrue(self.db.lexists('test_list', 'item2'))

if __name__ == '__main__':
    unittest.main()

Frequently Asked Questions

How to handle a corrupted JSON file?

If the JSON file becomes damaged, catch the exception and recreate the database:

import os
import json
import pickledb

def safe_load_db(filename):
    try:
        return pickledb.load(filename, auto_dump=True)
    except json.JSONDecodeError:
        # Move the corrupted file aside so the next load starts fresh;
        # reloading the same broken file would just raise again
        os.replace(filename, filename + '.corrupt')
        print(f"{filename} is corrupted. Creating a new database.")
        return pickledb.load(filename, auto_dump=True)

Can PickleDB be used in multithreaded applications?

PickleDB is not thread‑safe. Use a lock to synchronize access:

import threading
import pickledb

class ThreadSafePickleDB:
    def __init__(self, filename):
        self.db = pickledb.load(filename, auto_dump=True)
        self.lock = threading.Lock()
    
    def set(self, key, value):
        with self.lock:
            return self.db.set(key, value)
    
    def get(self, key):
        with self.lock:
            return self.db.get(key)

How to migrate data between application versions?

def migrate_database(db, current_version, target_version):
    if current_version < 2 <= target_version:
        # Migration from v1 to v2
        for key in db.getall():
            if key.startswith('old_'):
                value = db.get(key)
                new_key = key.replace('old_', 'new_')
                db.set(new_key, value)
                db.rem(key)
        db.set('db_version', 2)

How to enforce a size limit on the database?

def enforce_size_limit(db, max_keys=1000):
    if db.totalkeys() > max_keys:
        # Keys are evicted in alphabetical order, which is oldest-first
        # only if key names sort chronologically (e.g. timestamp prefixes)
        excess = sorted(db.getall())[:db.totalkeys() - max_keys]
        for key in excess:
            db.rem(key)

How to export data to other formats?

import csv
import json

def export_to_json(db, output_file):
    data = {key: db.get(key) for key in db.getall()}
    with open(output_file, 'w', encoding='utf-8') as f:
        json.dump(data, f, ensure_ascii=False, indent=2)

def export_to_csv(db, output_file):
    with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['Key', 'Value'])
        for key in db.getall():
            value = db.get(key)
            if isinstance(value, (dict, list)):
                value = json.dumps(value, ensure_ascii=False)
            writer.writerow([key, value])

Best Practices

Key Naming Conventions

Use prefixes to organize data logically:

# Bad
db.set('user1', {'name': 'John'})
db.set('config1', {'theme': 'dark'})

# Good
db.set('user:1:profile', {'name': 'John'})
db.set('user:1:settings', {'theme': 'dark'})
db.set('config:app:theme', 'dark')
db.set('cache:weather:moscow', {'temp': 20})

Data Validation

Always validate data before saving:

def save_user_profile(db, user_id, profile):
    required = ['name', 'email']
    if not all(field in profile for field in required):
        raise ValueError("Missing required fields")
    if '@' not in profile['email']:
        raise ValueError("Invalid email address")
    db.set(f'user:{user_id}:profile', profile)

Performance Monitoring

import time
import functools

def timing_decorator(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f"{func.__name__} executed in {end - start:.4f}s")
        return result
    return wrapper

@timing_decorator
def bulk_insert(db, data):
    for key, value in data.items():
        db.set(key, value)

Advantages and Limitations of PickleDB

Advantages

Ease of use: Minimal API, no SQL or complex concepts to learn.

Zero dependencies: Pure Python, no extra libraries required.

Human‑readable format: JSON files can be inspected and edited with any text editor.

Rapid development: Ideal for prototypes and quick solutions.

Built‑in persistence: Automatic data saving without extra setup.

Limitations

Scalability: Not suitable for large datasets (recommended limit ~100 MB).

No indexing: Value‑based searches require full scans.

No transactions: Lacks ACID guarantees.

File locking: Concurrent access from multiple processes can cause conflicts.

In‑memory load: The entire database is loaded into RAM at start‑up.
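The "no indexing" limitation is easy to see in code: because the data is one flat JSON document, any search by value has to visit every record. A stdlib-only sketch over a file in PickleDB's layout (the path and sample records are made up):

```python
import json
import os
import tempfile

# Build a sample file in PickleDB's flat JSON layout
records = {
    "user:1": {"city": "Moscow"},
    "user:2": {"city": "Kazan"},
    "user:3": {"city": "Moscow"},
}
path = os.path.join(tempfile.mkdtemp(), "scan.db")
with open(path, "w", encoding="utf-8") as f:
    json.dump(records, f)

def find_keys_by_field(db_path, field, wanted):
    # Linear scan: cost grows with the total number of keys,
    # since there is no index to narrow the search
    with open(db_path, encoding="utf-8") as f:
        db = json.load(f)
    return [key for key, value in db.items()
            if isinstance(value, dict) and value.get(field) == wanted]

matches = find_keys_by_field(path, "city", "Moscow")
print(matches)  # ['user:1', 'user:3']
```

If value lookups dominate your workload, this O(n) cost is the strongest argument for moving to SQLite or another indexed store.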

Conclusion

PickleDB is an excellent solution for scenarios that need simple, reliable data storage without the overhead of full‑featured DBMSs. It shines in configuration files, caching, small web apps, CLI tools, and rapid prototyping.

Typical use cases include fast prototype development, user‑settings storage, lightweight APIs for small projects, caching computation results, and local databases for desktop applications.

When choosing PickleDB, understand its constraints and apply it only where simplicity and modest data volumes are acceptable. For more demanding requirements, consider alternatives such as SQLite, TinyDB, or a full relational/NoSQL database.

Thanks to its minimal API and approachable design, PickleDB remains a popular choice among Python developers who need to add persistent storage quickly and effortlessly.
