Introduction
PickleDB is a lightweight key‑value database written in pure Python that requires no additional servers or dependencies. This library provides a simple solution for storing small amounts of data in JSON format, making it an ideal choice for rapid prototypes, CLI utilities, configuration files, and micro‑projects.
The main advantage of PickleDB lies in its ease of use while offering enough functionality to solve a wide range of data‑storage tasks. The library supports various data types: strings, numbers, lists, dictionaries, and boolean values.
What Is PickleDB and Where Is It Used?
PickleDB is a NoSQL database that stores data as key‑value pairs in a JSON file. It was created as an alternative for cases where full‑featured DBMSs such as PostgreSQL or MySQL would be overkill.
Typical use cases for PickleDB include:
- Storing user settings and configuration
- Caching intermediate computation results
- Saving authorization tokens and API keys
- Maintaining simple logs and counters
- Creating fast prototypes with data persistence
- Developing CLI applications with local storage
Installation and Initial Setup
Install PickleDB with the standard Python package manager:
pip install pickledb
Note: this guide describes the classic 0.x API (load, auto_dump, and the l*/d* helper methods). If a newer major release changes the interface, pin the version accordingly (for example, pip install pickledb==0.9.2).
After installation the library is ready to use without any extra configuration:
import pickledb
# Create or load a database
db = pickledb.load('database.db', auto_dump=True)
The auto_dump=True option automatically writes all changes to the file, which helps prevent data loss if the program terminates unexpectedly.
Architecture and Operating Principles
Data Storage Structure
PickleDB uses JSON as its storage format, providing human‑readable data that can be transferred across systems. The database file is a regular JSON document with a structure similar to:
{
  "simple_key": "string value",
  "numeric_key": 42,
  "boolean_key": true,
  "list_key": ["item1", "item2", "item3"],
  "dict_key": {
    "nested_field": "nested value",
    "nested_number": 100
  }
}
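Because the storage file is plain JSON, it can be inspected or round-tripped with nothing but the standard library. A quick sketch (the file name example_store.json is just an illustration):

```python
import json

# A PickleDB file is ordinary JSON: write a structure like the one above,
# then read it back with the standard json module.
data = {
    "simple_key": "string value",
    "numeric_key": 42,
    "list_key": ["item1", "item2", "item3"],
}

with open('example_store.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2)

with open('example_store.json', 'r', encoding='utf-8') as f:
    loaded = json.load(f)

print(loaded["numeric_key"])  # 42
```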
Saving Mechanism
When working with PickleDB, all data is loaded into memory at initialization. Modifications are applied to the in‑memory object and then, depending on the configuration, persisted to disk:
- With auto_dump=True, the database writes to disk after every write operation
- With auto_dump=False, you must call dump() manually
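The mechanism can be sketched in a few lines of plain Python. This is not PickleDB's actual source, just a simplified stand-in showing how auto_dump ties an in-memory dict to a JSON file:

```python
import json

class MiniStore:
    """Simplified sketch of the saving mechanism -- an illustration of the
    auto_dump idea, not PickleDB's real implementation."""

    def __init__(self, location, auto_dump):
        self.location = location
        self.auto_dump = auto_dump
        try:
            with open(location, 'r', encoding='utf-8') as f:
                self.data = json.load(f)   # entire file loaded into memory
        except (FileNotFoundError, json.JSONDecodeError):
            self.data = {}

    def dump(self):
        with open(self.location, 'w', encoding='utf-8') as f:
            json.dump(self.data, f)

    def set(self, key, value):
        self.data[key] = value             # modify the in-memory object
        if self.auto_dump:
            self.dump()                    # then persist immediately
        return True

store = MiniStore('mini_store.json', auto_dump=True)
store.set('counter', 1)                    # written to disk right away
```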
Complete Reference of PickleDB Methods and Functions
Core Data Manipulation Methods
| Method | Description | Example |
|---|---|---|
| set(key, value) | Saves a value under the given key | db.set('name', 'John') |
| get(key) | Retrieves the value for a key | name = db.get('name') |
| exists(key) | Checks whether a key exists | if db.exists('name'): |
| rem(key) | Removes a key and its value | db.rem('name') |
| getall() | Returns a list of all keys | keys = db.getall() |
| totalkeys() | Returns the total number of keys | count = db.totalkeys() |
| append(key, value) | Appends a string to an existing value | db.append('text', ' addition') |
Numeric Methods
| Method | Description | Example |
|---|---|---|
| incr(key, by=1) | Increments a numeric value | db.incr('counter') |
| decr(key, by=1) | Decrements a numeric value | db.decr('counter', 5) |
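Conceptually, incr() and decr() boil down to a read-modify-write on a stored number. A stand-in sketch using a plain dict in place of the database, purely for illustration:

```python
# 'store' is a plain dict standing in for the database.
store = {'counter': 0}

def incr(store, key, by=1):
    # Read the current value (default 0), adjust, write back
    store[key] = store.get(key, 0) + by
    return store[key]

def decr(store, key, by=1):
    return incr(store, key, -by)

incr(store, 'counter')        # counter -> 1
incr(store, 'counter', 5)     # counter -> 6
decr(store, 'counter', 2)     # counter -> 4
print(store['counter'])       # 4
```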
List Methods
| Method | Description | Example |
|---|---|---|
| lcreate(name) | Creates a new list | db.lcreate('items') |
| ladd(name, value) | Adds an element to a list | db.ladd('items', 'new_item') |
| lget(name, pos) | Retrieves an element by position | item = db.lget('items', 0) |
| lgetall(name) | Returns all elements of a list | items = db.lgetall('items') |
| lremlist(name) | Deletes the entire list | db.lremlist('items') |
| lremvalue(name, value) | Removes the first occurrence of a value | db.lremvalue('items', 'element') |
| lpop(name, pos) | Removes and returns an element by position | item = db.lpop('items', -1) |
| llen(name) | Returns the length of a list | length = db.llen('items') |
| lextend(name, seq) | Extends a list with a sequence | db.lextend('items', [1, 2, 3]) |
| lexists(name, value) | Checks whether a value exists in a list | exists = db.lexists('items', 'element') |
Dictionary Methods
| Method | Description | Example |
|---|---|---|
| dcreate(name) | Creates a new dictionary | db.dcreate('user_data') |
| dadd(name, pair) | Adds a key‑value pair to a dictionary | db.dadd('user_data', ('age', 25)) |
| dget(name, key) | Retrieves a value by key from a dictionary | age = db.dget('user_data', 'age') |
| dgetall(name) | Returns the entire dictionary | user = db.dgetall('user_data') |
| drem(name) | Deletes the whole dictionary | db.drem('user_data') |
| dpop(name, key) | Removes and returns a value by key | value = db.dpop('user_data', 'age') |
| dkeys(name) | Returns all keys of a dictionary | keys = db.dkeys('user_data') |
| dvals(name) | Returns all values of a dictionary | values = db.dvals('user_data') |
| dexists(name, key) | Checks whether a key exists in a dictionary | exists = db.dexists('user_data', 'age') |
| dmerge(name1, name2) | Merges two dictionaries | db.dmerge('dict1', 'dict2') |
Utility Methods
| Method | Description | Example |
|---|---|---|
| dump() | Forcefully writes data to the file | db.dump() |
| load(location, auto_dump) | Loads a database from a file | db = pickledb.load('db.json', True) |
| deldb() | Deletes all data from the database | db.deldb() |
Working with Different Data Types
Simple Data Types
PickleDB supports all core Python data types:
import pickledb
db = pickledb.load('example.db', auto_dump=True)
# Strings
db.set('username', 'admin')
db.set('email', 'user@example.com')
# Numbers
db.set('user_id', 12345)
db.set('balance', 1000.50)
# Booleans
db.set('is_active', True)
db.set('email_verified', False)
# Retrieval
username = db.get('username') # 'admin'
balance = db.get('balance') # 1000.5
is_active = db.get('is_active') # True
List Operations
PickleDB offers a rich API for list handling:
# Create and populate a list
db.lcreate('shopping_list')
db.ladd('shopping_list', 'milk')
db.ladd('shopping_list', 'bread')
db.ladd('shopping_list', 'eggs')
# Retrieve items
all_items = db.lgetall('shopping_list') # ['milk', 'bread', 'eggs']
first_item = db.lget('shopping_list', 0) # 'milk'
last_item = db.lget('shopping_list', -1) # 'eggs'
# Check existence
has_milk = db.lexists('shopping_list', 'milk') # True
# Remove items
db.lremvalue('shopping_list', 'bread') # removes 'bread'
removed_item = db.lpop('shopping_list', 0) # removes and returns first element
# Extend list
db.lextend('shopping_list', ['butter', 'cheese', 'yogurt'])
Dictionary Operations
For more complex structures PickleDB provides dictionary methods:
# Create a user profile
db.dcreate('user_profile')
db.dadd('user_profile', ('name', 'Anna Petrova'))
db.dadd('user_profile', ('age', 28))
db.dadd('user_profile', ('city', 'Moscow'))
db.dadd('user_profile', ('occupation', 'Developer'))
# Retrieve data
user_name = db.dget('user_profile', 'name') # 'Anna Petrova'
full_profile = db.dgetall('user_profile') # complete dict
# Check field existence
has_phone = db.dexists('user_profile', 'phone') # False
# Keys and values
profile_keys = db.dkeys('user_profile') # ['name', 'age', 'city', 'occupation']
profile_values = db.dvals('user_profile') # ['Anna Petrova', 28, 'Moscow', 'Developer']
# Delete a field
old_age = db.dpop('user_profile', 'age') # removes and returns age
Data Saving Modes
Automatic Saving
With auto_dump=True every change is instantly persisted to disk:
db = pickledb.load('auto_save.db', auto_dump=True)
db.set('last_update', '2024-01-15') # automatically saved
Manual Saving
Setting auto_dump=False lets you control when data is written:
db = pickledb.load('manual_save.db', auto_dump=False)
# Perform many operations
db.set('counter', 1)
db.incr('counter')
db.lcreate('items')
db.ladd('items', 'item1')
# Flush all changes with a single call
db.dump()
This approach is useful for bulk writes because it reduces disk I/O.
In‑Memory Only Mode
For transient data you can keep the database purely in memory:
db = pickledb.load('temp.db', auto_dump=False)
# Work with data but never call dump()
# All data is lost when the program exits
Integration with Web Frameworks
Using with Flask
PickleDB integrates smoothly with Flask for storing configurations and user data:
from flask import Flask, request, jsonify
import pickledb

app = Flask(__name__)
db = pickledb.load('flask_app.db', auto_dump=True)

@app.route('/user/<user_id>', methods=['GET'])
def get_user(user_id):
    if db.exists(f'user_{user_id}'):
        user_data = db.get(f'user_{user_id}')
        return jsonify(user_data)
    return jsonify({'error': 'User not found'}), 404

@app.route('/user/<user_id>', methods=['POST'])
def create_user(user_id):
    user_data = request.get_json()
    db.set(f'user_{user_id}', user_data)
    return jsonify({'status': 'User created'})

@app.route('/stats')
def get_stats():
    return jsonify({
        'total_users': db.totalkeys(),
        'all_keys': db.getall()
    })

if __name__ == '__main__':
    app.run(debug=True)
Using with FastAPI
FastAPI can be paired with PickleDB to build fast APIs:
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import pickledb

app = FastAPI()
db = pickledb.load('fastapi_app.db', auto_dump=True)

class UserModel(BaseModel):
    name: str
    email: str
    age: int

@app.get('/users/{user_id}')
async def get_user(user_id: str):
    if not db.exists(f'user_{user_id}'):
        raise HTTPException(status_code=404, detail="User not found")
    return db.get(f'user_{user_id}')

@app.post('/users/{user_id}')
async def create_user(user_id: str, user: UserModel):
    db.set(f'user_{user_id}', user.dict())
    return {"message": "User created successfully"}

@app.delete('/users/{user_id}')
async def delete_user(user_id: str):
    if not db.exists(f'user_{user_id}'):
        raise HTTPException(status_code=404, detail="User not found")
    db.rem(f'user_{user_id}')
    return {"message": "User deleted successfully"}
Practical Usage Examples
Application Configuration System
import pickledb

class AppConfig:
    def __init__(self, config_file='app_config.db'):
        self.db = pickledb.load(config_file, auto_dump=True)
        self._set_defaults()

    def _set_defaults(self):
        defaults = {
            'theme': 'light',
            'language': 'en',
            'auto_save': True,
            'log_level': 'INFO',
            'max_connections': 100
        }
        for key, value in defaults.items():
            if not self.db.exists(key):
                self.db.set(key, value)

    def get(self, key):
        return self.db.get(key)

    def set(self, key, value):
        self.db.set(key, value)

    def reset_to_defaults(self):
        self.db.deldb()
        self._set_defaults()

# Usage
config = AppConfig()
print(f"Current theme: {config.get('theme')}")
config.set('theme', 'dark')
API Response Caching
import pickledb
from datetime import datetime, timedelta

class APICache:
    def __init__(self, cache_file='api_cache.db', ttl_minutes=30):
        self.db = pickledb.load(cache_file, auto_dump=True)
        self.ttl_minutes = ttl_minutes

    def _is_expired(self, timestamp):
        cache_time = datetime.fromisoformat(timestamp)
        return datetime.now() - cache_time > timedelta(minutes=self.ttl_minutes)

    def get_cached_response(self, url):
        cache_key = f"cache_{url}"
        timestamp_key = f"time_{url}"
        if self.db.exists(cache_key) and self.db.exists(timestamp_key):
            if not self._is_expired(self.db.get(timestamp_key)):
                return self.db.get(cache_key)
        return None

    def cache_response(self, url, response_data):
        cache_key = f"cache_{url}"
        timestamp_key = f"time_{url}"
        self.db.set(cache_key, response_data)
        self.db.set(timestamp_key, datetime.now().isoformat())

    def clear_expired(self):
        # Copy the key list first: removing keys while iterating the live
        # view would raise a RuntimeError
        for key in list(self.db.getall()):
            if key.startswith('time_') and self._is_expired(self.db.get(key)):
                url = key.replace('time_', '')
                self.db.rem(f'cache_{url}')
                self.db.rem(key)

# Usage
cache = APICache()

def get_weather_data(city):
    cached = cache.get_cached_response(f'weather_{city}')
    if cached:
        return cached
    # Simulate an API call
    api_data = {'city': city, 'temp': 20, 'humidity': 60}
    cache.cache_response(f'weather_{city}', api_data)
    return api_data
CLI Utility with History Persistence
import pickledb
import argparse
from datetime import datetime

class TaskManager:
    def __init__(self):
        self.db = pickledb.load('tasks.db', auto_dump=True)
        if not self.db.exists('tasks'):
            self.db.lcreate('tasks')
        if not self.db.exists('completed_tasks'):
            self.db.lcreate('completed_tasks')

    def add_task(self, description):
        task = {
            'id': self.db.llen('tasks') + 1,
            'description': description,
            'created_at': datetime.now().isoformat(),
            'completed': False
        }
        self.db.ladd('tasks', task)
        print(f"Task added: {description}")

    def list_tasks(self):
        tasks = self.db.lgetall('tasks')
        if not tasks:
            print("No active tasks")
            return
        print("Active tasks:")
        for task in tasks:
            print(f"  [{task['id']}] {task['description']}")

    def complete_task(self, task_id):
        tasks = self.db.lgetall('tasks')
        for i, task in enumerate(tasks):
            if task['id'] == task_id:
                task['completed'] = True
                task['completed_at'] = datetime.now().isoformat()
                self.db.ladd('completed_tasks', task)
                self.db.lpop('tasks', i)
                print(f"Task {task_id} completed!")
                return
        print(f"Task with ID {task_id} not found")

    def show_stats(self):
        active = self.db.llen('tasks')
        completed = self.db.llen('completed_tasks')
        print(f"Active tasks: {active}")
        print(f"Completed tasks: {completed}")

def main():
    parser = argparse.ArgumentParser(description='Task manager')
    parser.add_argument('command', choices=['add', 'list', 'complete', 'stats'])
    parser.add_argument('--description', help='Task description')
    parser.add_argument('--id', type=int, help='Task ID')
    args = parser.parse_args()

    manager = TaskManager()
    if args.command == 'add' and args.description:
        manager.add_task(args.description)
    elif args.command == 'list':
        manager.list_tasks()
    elif args.command == 'complete' and args.id:
        manager.complete_task(args.id)
    elif args.command == 'stats':
        manager.show_stats()

if __name__ == '__main__':
    main()
Performance Optimization
Batch Operations
For high‑throughput writes, disable automatic dumping and flush once after the batch:
db = pickledb.load('bulk_data.db', auto_dump=False)
# Perform many writes
for i in range(1000):
    db.set(f'item_{i}', {'value': i, 'processed': False})
# Single dump at the end
db.dump()
Managing Database Size
Regularly prune outdated records to keep the database fast:
def cleanup_old_data(db, days_old=30):
    from datetime import datetime, timedelta
    cutoff = datetime.now() - timedelta(days=days_old)
    # Copy the key list: removing keys while iterating the live view would fail
    for key in list(db.getall()):
        if key.startswith('temp_'):
            data = db.get(key)
            if isinstance(data, dict) and 'created_at' in data:
                created = datetime.fromisoformat(data['created_at'])
                if created < cutoff:
                    db.rem(key)
Error Handling and Debugging
Data Integrity Checks
def validate_database(db):
    try:
        keys = db.getall()
        print(f"Total keys: {len(keys)}")
        for key in keys:
            value = db.get(key)
            if value is None:
                print(f"Warning: key '{key}' has None value")
        return True
    except Exception as e:
        print(f"Validation error: {e}")
        return False
Creating Backups
import shutil
from datetime import datetime

def backup_database(db_file):
    timestamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    backup_file = f"{db_file}.backup_{timestamp}"
    try:
        shutil.copy2(db_file, backup_file)
        print(f"Backup created: {backup_file}")
        return backup_file
    except Exception as e:
        print(f"Backup error: {e}")
        return None
Comparison with Alternatives
PickleDB vs TinyDB vs SQLite
| Feature | PickleDB | TinyDB | SQLite |
|---|---|---|---|
| Library size | Very small | Medium | Built‑in |
| Query syntax | Simple API | Query API | SQL |
| Performance | High for small datasets | Medium | High |
| Transaction support | No | Limited | Full |
| Indexing | No | Yes | Full |
| Best suited for | Configs, cache | Mid‑size apps | Relational data |
| Maximum size | Up to ~100 MB | Up to ~1 GB | Practically unlimited |
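For context, the same key‑value pattern takes only a few more lines in SQLite using the standard library's sqlite3 module, while gaining real transactions and indexing. A minimal sketch (values are serialized to JSON so arbitrary structures fit in a single TEXT column; the kv_set/kv_get helper names are ours, not a library API):

```python
import json
import sqlite3

# Key-value storage on top of SQLite, for comparison with PickleDB.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE kv (key TEXT PRIMARY KEY, value TEXT)')

def kv_set(key, value):
    # INSERT OR REPLACE gives set-like upsert semantics
    conn.execute('INSERT OR REPLACE INTO kv VALUES (?, ?)',
                 (key, json.dumps(value)))
    conn.commit()

def kv_get(key):
    row = conn.execute('SELECT value FROM kv WHERE key = ?',
                       (key,)).fetchone()
    return json.loads(row[0]) if row else None

kv_set('user:1', {'name': 'John', 'age': 30})
print(kv_get('user:1'))   # {'name': 'John', 'age': 30}
```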
Testing Applications with PickleDB
Using Temporary Databases
import unittest
import tempfile
import os
import pickledb

class TestPickleDBApp(unittest.TestCase):
    def setUp(self):
        # Create a temporary file for tests
        self.temp_file = tempfile.NamedTemporaryFile(delete=False)
        self.temp_file.close()
        self.db = pickledb.load(self.temp_file.name, auto_dump=True)

    def tearDown(self):
        # Remove the temporary file after tests
        os.unlink(self.temp_file.name)

    def test_basic_operations(self):
        self.db.set('test_key', 'test_value')
        self.assertEqual(self.db.get('test_key'), 'test_value')
        self.assertTrue(self.db.exists('test_key'))
        self.db.rem('test_key')
        self.assertFalse(self.db.exists('test_key'))

    def test_list_operations(self):
        self.db.lcreate('test_list')
        self.db.ladd('test_list', 'item1')
        self.db.ladd('test_list', 'item2')
        self.assertEqual(self.db.llen('test_list'), 2)
        self.assertEqual(self.db.lget('test_list', 0), 'item1')
        self.assertTrue(self.db.lexists('test_list', 'item2'))

if __name__ == '__main__':
    unittest.main()
Frequently Asked Questions
How to handle a corrupted JSON file?
If the JSON file becomes damaged, catch the exception and recreate the database:
import pickledb
import json
import os

def safe_load_db(filename):
    try:
        return pickledb.load(filename, auto_dump=True)
    except json.JSONDecodeError:
        # Move the damaged file aside so load() can start a fresh database;
        # reloading the same corrupt file would raise again.
        os.rename(filename, filename + '.corrupt')
        print(f"{filename} is corrupted. Creating a new database.")
        return pickledb.load(filename, auto_dump=True)
Can PickleDB be used in multithreaded applications?
PickleDB is not thread‑safe. Use a lock to synchronize access:
import threading
import pickledb

class ThreadSafePickleDB:
    def __init__(self, filename):
        self.db = pickledb.load(filename, auto_dump=True)
        self.lock = threading.Lock()

    def set(self, key, value):
        with self.lock:
            return self.db.set(key, value)

    def get(self, key):
        with self.lock:
            return self.db.get(key)
How to migrate data between application versions?
def migrate_database(db, current_version, target_version):
    if current_version < 2 <= target_version:
        # Migration from v1 to v2: rename keys with the old prefix.
        # Copy the key list, since we modify keys while looping.
        for key in list(db.getall()):
            if key.startswith('old_'):
                value = db.get(key)
                new_key = key.replace('old_', 'new_')
                db.set(new_key, value)
                db.rem(key)
        db.set('db_version', 2)
How to enforce a size limit on the database?
def enforce_size_limit(db, max_keys=1000):
    if db.totalkeys() > max_keys:
        # Drop the excess keys in sort order (assumes sortable key names)
        keys = sorted(db.getall())[:db.totalkeys() - max_keys]
        for key in keys:
            db.rem(key)
How to export data to other formats?
import csv
import json

def export_to_json(db, output_file):
    data = {key: db.get(key) for key in db.getall()}
    with open(output_file, 'w', encoding='utf-8') as f:
        json.dump(data, f, ensure_ascii=False, indent=2)

def export_to_csv(db, output_file):
    with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['Key', 'Value'])
        for key in db.getall():
            value = db.get(key)
            if isinstance(value, (dict, list)):
                value = json.dumps(value, ensure_ascii=False)
            writer.writerow([key, value])
Best Practices
Key Naming Conventions
Use prefixes to organize data logically:
# Bad
db.set('user1', {'name': 'John'})
db.set('config1', {'theme': 'dark'})
# Good
db.set('user:1:profile', {'name': 'John'})
db.set('user:1:settings', {'theme': 'dark'})
db.set('config:app:theme', 'dark')
db.set('cache:weather:moscow', {'temp': 20})
Data Validation
Always validate data before saving:
def save_user_profile(db, user_id, profile):
    required = ['name', 'email']
    if not all(field in profile for field in required):
        raise ValueError("Missing required fields")
    if '@' not in profile['email']:
        raise ValueError("Invalid email address")
    db.set(f'user:{user_id}:profile', profile)
Performance Monitoring
import time
import functools

def timing_decorator(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.time()
        result = func(*args, **kwargs)
        end = time.time()
        print(f"{func.__name__} executed in {end - start:.4f}s")
        return result
    return wrapper

@timing_decorator
def bulk_insert(db, data):
    for key, value in data.items():
        db.set(key, value)
Advantages and Limitations of PickleDB
Advantages
Ease of use: Minimal API, no SQL or complex concepts to learn.
Zero dependencies: Pure Python, no extra libraries required.
Human‑readable format: JSON files can be inspected and edited with any text editor.
Rapid development: Ideal for prototypes and quick solutions.
Built‑in persistence: Automatic data saving without extra setup.
Limitations
Scalability: Not suitable for large datasets (recommended limit ~100 MB).
No indexing: Value‑based searches require full scans.
No transactions: Lacks ACID guarantees.
File locking: Concurrent access from multiple processes can cause conflicts.
In‑memory load: The entire database is loaded into RAM at start‑up.
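The file-locking limitation can be mitigated by wrapping each write in an advisory lock shared by all processes. A POSIX-only sketch (fcntl is unavailable on Windows; the .lock companion file name is an arbitrary choice for illustration):

```python
import fcntl
import json

def locked_write(path, data):
    # Take an exclusive advisory lock on a companion lock file before
    # rewriting the data file, so concurrent writers serialize.
    with open(path + '.lock', 'w') as lock:
        fcntl.flock(lock, fcntl.LOCK_EX)   # blocks until we own the lock
        try:
            with open(path, 'w', encoding='utf-8') as f:
                json.dump(data, f)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)

locked_write('store.json', {'counter': 1})
```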
Conclusion
PickleDB is an excellent solution for scenarios that need simple, reliable data storage without the overhead of full‑featured DBMSs. It shines in configuration files, caching, small web apps, CLI tools, and rapid prototyping.
Typical use cases include fast prototype development, user‑settings storage, lightweight APIs for small projects, caching computation results, and local databases for desktop applications.
When choosing PickleDB, understand its constraints and apply it only where simplicity and modest data volumes are acceptable. For more demanding requirements, consider alternatives such as SQLite, TinyDB, or a full relational/NoSQL database.
Thanks to its minimal API, solid documentation, and active community, PickleDB remains a popular choice among Python developers who need to add persistent storage quickly and effortlessly.