History of the library
Prophet was first presented in 2017 by researchers from the Facebook Core Data Science Team. The library was built to solve the company's own forecasting tasks across a range of metrics, from user activity to infrastructure capacity planning.
The primary developers were Sean Taylor and Ben Letham, who aimed to create a tool that combines statistical rigor with practical applicability. The library was open-sourced in 2017 with both Python and R implementations, accompanied by the paper "Forecasting at Scale".
Mathematical foundations and operating principles
Prophet additive model
Prophet decomposes a time series into interpretable components:
[ y(t) = g(t) + s(t) + h(t) + \epsilon_t ]
where [ g(t) ] is the trend, [ s(t) ] the seasonal component, [ h(t) ] the holiday effects, and [ \epsilon_t ] the error term.
Trend modeling
Prophet supports two types of trends:
Linear trend: [ g(t) = (k + a(t)^T \delta) \cdot t + (m + a(t)^T \gamma) ]
Logistic trend: [ g(t) = \frac{C(t)}{1 + \exp(-(k + a(t)^T \delta)(t - (m + a(t)^T \gamma)))} ]
where [ C(t) ] is the carrying capacity (maximum capacity), [ k ] is the base growth rate, [ m ] is the offset, [ \delta ] holds the growth-rate adjustments at the changepoint locations, [ a(t) ] is the changepoint indicator vector, and [ \gamma ] adjusts the offsets so the trend remains continuous at each changepoint.
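For logistic growth, the carrying capacity [ C(t) ] is supplied by the user as a `cap` column on the dataframe. A minimal sketch of preparing such data (the ceiling value of 100 is an illustrative assumption):

```python
import pandas as pd
import numpy as np

# Hypothetical daily series approaching a saturation level
df = pd.DataFrame({
    'ds': pd.date_range('2022-01-01', periods=100, freq='D'),
    'y': np.linspace(10, 80, 100),
})

# Logistic growth requires a 'cap' column holding C(t); it may vary over
# time, but here we assume a constant market ceiling of 100.
df['cap'] = 100.0

# An optional 'floor' column sets a lower saturation bound.
df['floor'] = 0.0

# The model would then be built as Prophet(growth='logistic') and fitted on df;
# the future dataframe passed to predict() must carry the same columns.
print(df[['ds', 'y', 'cap', 'floor']].head())
```

Note that forgetting to add `cap` to the future dataframe is a common source of errors with logistic growth.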
Seasonal component
Seasonality is modeled with Fourier series:
[ s(t) = \sum_{n=1}^{N} \left( a_n \cos\frac{2\pi n t}{P} + b_n \sin\frac{2\pi n t}{P} \right) ]
where [ P ] is the seasonality period, [ N ] is the number of harmonics (fourier_order), and [ a_n, b_n ] are coefficients estimated during fitting.
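The Fourier expansion above can be reproduced with plain NumPy; this sketch builds the design matrix of cosine/sine columns that Prophet regresses on internally:

```python
import numpy as np

def fourier_features(t, period, order):
    """Fourier design matrix: columns cos(2*pi*n*t/P), sin(2*pi*n*t/P)
    for n = 1..order, matching the s(t) expansion above."""
    t = np.asarray(t, dtype=float)
    cols = []
    for n in range(1, order + 1):
        cols.append(np.cos(2 * np.pi * n * t / period))
        cols.append(np.sin(2 * np.pi * n * t / period))
    return np.column_stack(cols)

# 30 daily time steps, weekly period (P = 7), 3 harmonics -> 6 columns
X = fourier_features(np.arange(30), period=7, order=3)
print(X.shape)  # (30, 6)
```

A higher `fourier_order` allows the seasonal curve to bend more sharply, at the cost of potential overfitting.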
Installation and environment setup
Basic installation
pip install prophet
Installation with extra dependencies
pip install prophet[plot]
pip install plotly # for interactive visualisation
Installation for different operating systems
Windows:
pip install pystan
pip install prophet
macOS:
brew install gcc
pip install prophet
Linux (Ubuntu/Debian):
sudo apt-get install build-essential
pip install prophet
Importing the library
from prophet import Prophet
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from prophet.plot import plot_plotly, plot_components_plotly
Preparing data for the model
Data format requirements
Prophet strictly requires the following data structure:
- Column ‘ds’ – timestamps (datetime)
- Column ‘y’ – time‑series values (numeric)
Example data preparation
import pandas as pd
import numpy as np
# Create example data
dates = pd.date_range(start='2020-01-01', end='2023-12-31', freq='D')
values = np.random.randn(len(dates)).cumsum() + 100
df = pd.DataFrame({
'ds': dates,
'y': values
})
# Load real data
df = pd.read_csv('your_data.csv')
df['ds'] = pd.to_datetime(df['date_column'])
df['y'] = pd.to_numeric(df['value_column'], errors='coerce')
# Drop missing values
df = df.dropna(subset=['ds', 'y'])
df = df.sort_values('ds').reset_index(drop=True)
Data validation
def validate_prophet_data(df):
"""Validate data for Prophet"""
assert 'ds' in df.columns, "Missing column 'ds'"
assert 'y' in df.columns, "Missing column 'y'"
assert pd.api.types.is_datetime64_any_dtype(df['ds']), "Column 'ds' must be datetime"
assert pd.api.types.is_numeric_dtype(df['y']), "Column 'y' must be numeric"
assert df['ds'].is_monotonic_increasing, "Dates must be sorted"
print(f"Data is valid. Period: {df['ds'].min()} - {df['ds'].max()}")
print(f"Number of observations: {len(df)}")
print(f"Missing values: {df['y'].isna().sum()}")
Basic model training and forecasting
Model creation and training
# Initialise model
model = Prophet()
# Fit model
model.fit(df)
# Create future dataframe
future = model.make_future_dataframe(periods=365) # one‑year forecast
# Generate forecast
forecast = model.predict(future)
Forecast results analysis
# View key forecast columns
# (seasonal components appear as separate columns, e.g. 'weekly', 'yearly')
forecast_columns = ['ds', 'yhat', 'yhat_lower', 'yhat_upper', 'trend']
print(forecast[forecast_columns].tail(10))
# Forecast statistics
print(f"Average forecast: {forecast['yhat'].mean():.2f}")
print(f"Confidence interval: [{forecast['yhat_lower'].mean():.2f}, {forecast['yhat_upper'].mean():.2f}]")
Result visualisation
Static visualisation
import matplotlib.pyplot as plt
# Main forecast plot
fig1 = model.plot(forecast)
plt.title('Time‑series forecast')
plt.xlabel('Date')
plt.ylabel('Value')
plt.show()
# Component analysis
fig2 = model.plot_components(forecast)
plt.show()
Interactive visualisation
from prophet.plot import plot_plotly, plot_components_plotly
# Interactive forecast plot
fig = plot_plotly(model, forecast)
fig.show()
# Interactive component analysis
fig_components = plot_components_plotly(model, forecast)
fig_components.show()
Custom visualisation
def plot_forecast_custom(model, forecast, df):
"""Custom visualisation of the forecast"""
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(15, 10))
# Main forecast
ax1.plot(df['ds'], df['y'], 'ko', markersize=3, label='Historical data')
ax1.plot(forecast['ds'], forecast['yhat'], 'b-', label='Forecast')
ax1.fill_between(forecast['ds'], forecast['yhat_lower'], forecast['yhat_upper'],
alpha=0.3, color='blue', label='Confidence interval')
ax1.set_title('Time‑series forecast')
ax1.legend()
ax1.grid(True)
# Residuals
historical_forecast = forecast[forecast['ds'].isin(df['ds'])]
residuals = df['y'].values - historical_forecast['yhat'].values
ax2.plot(df['ds'], residuals, 'ro', markersize=2)
ax2.axhline(y=0, color='k', linestyle='--')
ax2.set_title('Model residuals')
ax2.grid(True)
plt.tight_layout()
plt.show()
Component analysis of the time series
Trend analysis
# Extract trend
trend = forecast['trend']
print(f"Initial trend value: {trend.iloc[0]:.2f}")
print(f"Final trend value: {trend.iloc[-1]:.2f}")
print(f"Total growth: {trend.iloc[-1] - trend.iloc[0]:.2f}")
# Growth rate
growth_rate = (trend.iloc[-1] - trend.iloc[0]) / len(trend)
print(f"Average daily growth rate: {growth_rate:.4f} units per day")
Seasonality analysis
# Extract seasonal components
weekly_seasonality = forecast.filter(regex='weekly').mean()
yearly_seasonality = forecast.filter(regex='yearly').mean()
print("Seasonal effects:")
print(f"Weekly seasonality: {weekly_seasonality.abs().mean():.2f}")
print(f"Yearly seasonality: {yearly_seasonality.abs().mean():.2f}")
Model parameter configuration
Core parameters
model = Prophet(
# Seasonality
daily_seasonality=True, # Daily seasonality
weekly_seasonality=True, # Weekly seasonality
yearly_seasonality=True, # Yearly seasonality
# Seasonality mode
seasonality_mode='additive', # 'additive' or 'multiplicative'
# Sensitivity to changes
changepoint_prior_scale=0.05, # Trend change sensitivity
seasonality_prior_scale=10.0, # Seasonality regularisation
holidays_prior_scale=10.0, # Holiday regularisation
# Confidence interval
interval_width=0.95, # Width of confidence interval
# Number of potential changepoints
n_changepoints=25, # Potential changepoints count
# Changepoint search range
changepoint_range=0.8 # Fraction of data used for changepoint search
)
Parameter optimisation
from sklearn.metrics import mean_absolute_error, mean_squared_error
def optimize_prophet_parameters(df, param_grid):
"""Optimise Prophet hyper‑parameters"""
best_mae = float('inf')
best_params = None
# Train‑test split
train_size = int(len(df) * 0.8)
train_df = df[:train_size]
test_df = df[train_size:]
for params in param_grid:
model = Prophet(**params)
model.fit(train_df)
# Forecast on test set
future = model.make_future_dataframe(periods=len(test_df))
forecast = model.predict(future)
# Compute MAE
test_forecast = forecast.tail(len(test_df))
mae = mean_absolute_error(test_df['y'], test_forecast['yhat'])
if mae < best_mae:
best_mae = mae
best_params = params
return best_params, best_mae
# Example parameter grid
param_grid = [
{'changepoint_prior_scale': 0.01, 'seasonality_prior_scale': 0.1},
{'changepoint_prior_scale': 0.05, 'seasonality_prior_scale': 1.0},
{'changepoint_prior_scale': 0.1, 'seasonality_prior_scale': 10.0},
]
Working with holidays and special events
Creating a holiday calendar
# Define holidays
holidays = pd.DataFrame({
'holiday': ['New Year', 'Christmas', 'Defender of the Fatherland Day',
'International Women\'s Day', 'Spring and Labour Day'],
'ds': pd.to_datetime(['2023-01-01', '2023-01-07', '2023-02-23',
'2023-03-08', '2023-05-01']),
'lower_window': [-1, -1, 0, 0, 0], # Days before the holiday
'upper_window': [1, 1, 0, 0, 0], # Days after the holiday
})
# Use built‑in country holidays
model = Prophet()
model.add_country_holidays(country_name='RU') # Russian holidays
model.fit(df)
Custom events
# Define promotional campaigns
promo_events = pd.DataFrame({
'holiday': 'promo_campaign',
'ds': pd.to_datetime(['2023-03-15', '2023-06-15', '2023-09-15', '2023-12-15']),
'lower_window': 0,
'upper_window': 7, # Effect lasts one week
})
# Combine with standard holidays
all_holidays = pd.concat([holidays, promo_events], ignore_index=True)
model = Prophet(holidays=all_holidays)
model.fit(df)
Adding external regressors
Simple regressor
# Add temperature as a regressor
df['temperature'] = np.random.normal(20, 10, len(df))
model = Prophet()
model.add_regressor('temperature')
model.fit(df)
# Future regressor values are required for forecasting
future = model.make_future_dataframe(periods=30)
future['temperature'] = np.random.normal(20, 10, len(future))
forecast = model.predict(future)
Multiple regressors
# Add several regressors
df['marketing_spend'] = np.random.exponential(1000, len(df))
df['competitor_price'] = np.random.normal(100, 20, len(df))
model = Prophet()
model.add_regressor('temperature')
model.add_regressor('marketing_spend')
model.add_regressor('competitor_price', standardize=True)
model.fit(df)
Custom seasonality
Adding a custom seasonality
# Monthly seasonality
model = Prophet()
model.add_seasonality(name='monthly', period=30.5, fourier_order=5)
# Quarterly seasonality
model.add_seasonality(name='quarterly', period=91.25, fourier_order=8)
# Conditional seasonality (e.g., weekends only)
def is_weekend(ds):
date = pd.to_datetime(ds)
return date.weekday() >= 5
df['is_weekend'] = df['ds'].apply(is_weekend)
model.add_seasonality(name='weekend_seasonality', period=7, fourier_order=3, condition_name='is_weekend')
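A practical caveat with conditional seasonality: the condition column must be present not only in the training data but also in the future dataframe passed to predict(). A small pandas sketch of building the same weekend flag for future dates (the date range here is illustrative):

```python
import pandas as pd

# Hypothetical future dates (in practice produced by make_future_dataframe)
future = pd.DataFrame({'ds': pd.date_range('2024-01-01', periods=14, freq='D')})

# Apply the same weekend rule that was used on the training data
future['is_weekend'] = future['ds'].dt.dayofweek >= 5

print(future['is_weekend'].sum())  # 4 weekend days in two weeks
```

If the condition column is missing from the future frame, predict() will raise an error.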
Disabling standard seasonality
# Turn off built‑in seasonality
model = Prophet(
daily_seasonality=False,
weekly_seasonality=False,
yearly_seasonality=False
)
# Add only the required seasonality
model.add_seasonality(name='custom_yearly', period=365.25, fourier_order=10)
Outlier and missing‑value handling
Automatic outlier handling
def detect_outliers(df, threshold=3):
"""Detect outliers using Z‑score"""
z_scores = np.abs((df['y'] - df['y'].mean()) / df['y'].std())
outliers = df[z_scores > threshold]
return outliers
# Detect outliers
outliers = detect_outliers(df)
print(f"Found outliers: {len(outliers)}")
# Replace outliers with NaN (Prophet will handle them automatically)
df_cleaned = df.copy()
df_cleaned.loc[outliers.index, 'y'] = np.nan
Interpolating missing values
# Prophet handles gaps automatically, but you can pre‑interpolate if desired
df_interpolated = df.copy()
df_interpolated['y'] = df_interpolated['y'].interpolate(method='linear')
Cross‑validation and performance metrics
Time‑series cross‑validation
from prophet.diagnostics import cross_validation, performance_metrics
# Cross‑validation
df_cv = cross_validation(
model,
initial='730 days', # Initial training period
period='180 days', # Validation step
horizon='365 days' # Forecast horizon
)
# Performance metrics
df_performance = performance_metrics(df_cv)
print(df_performance)
Visualising cross‑validation
from prophet.plot import plot_cross_validation_metric
# Metric plot over forecast horizon
fig = plot_cross_validation_metric(df_cv, metric='mape')
plt.show()
Custom metrics
def calculate_custom_metrics(df_cv):
"""Calculate additional evaluation metrics"""
metrics = {}
# Mean Absolute Percentage Error
metrics['mape'] = np.mean(np.abs((df_cv['y'] - df_cv['yhat']) / df_cv['y'])) * 100
# Coefficient of determination (R²)
ss_res = np.sum((df_cv['y'] - df_cv['yhat']) ** 2)
ss_tot = np.sum((df_cv['y'] - np.mean(df_cv['y'])) ** 2)
metrics['r2'] = 1 - (ss_res / ss_tot)
# Symmetric MAPE
metrics['smape'] = np.mean(2 * np.abs(df_cv['y'] - df_cv['yhat']) /
(np.abs(df_cv['y']) + np.abs(df_cv['yhat']))) * 100
return metrics
Integration with popular libraries
Scikit‑learn integration
from sklearn.base import BaseEstimator, RegressorMixin
class ProphetRegressor(BaseEstimator, RegressorMixin):
"""Prophet wrapper for Scikit‑learn pipelines"""
def __init__(self, **prophet_params):
self.prophet_params = prophet_params
self.model = None
def fit(self, X, y):
df = pd.DataFrame({'ds': X.flatten(), 'y': y})
self.model = Prophet(**self.prophet_params)
self.model.fit(df)
return self
def predict(self, X):
if self.model is None:
raise ValueError("Model not fitted yet")
future = pd.DataFrame({'ds': X.flatten()})
forecast = self.model.predict(future)
return forecast['yhat'].values
# Use the wrapper like any other sklearn estimator (it can also be
# dropped into a sklearn Pipeline)
prophet_regressor = ProphetRegressor(yearly_seasonality=True)
dates = pd.date_range('2020-01-01', periods=len(df), freq='D')
prophet_regressor.fit(dates.values.reshape(-1, 1), df['y'].values)
PyCaret integration
# Install PyCaret
# pip install pycaret
# Use Prophet inside PyCaret
from pycaret.time_series import *
# Initialise experiment
exp = setup(
data=df,
target='y',
session_id=123,
train_size=0.8,
fold_strategy='expanding',
fold=3
)
# Create Prophet model
prophet_model = create_model('prophet')
# Finalise model
finalized_model = finalize_model(prophet_model)
# Forecast
predictions = predict_model(finalized_model, fh=30)
Complete table of Prophet methods and functions
Core Prophet class methods
| Method | Description | Parameters | Return value |
|---|---|---|---|
| `__init__()` | Model initialisation | `growth='linear'`, `changepoints=None`, `n_changepoints=25`, `changepoint_range=0.8`, `yearly_seasonality='auto'`, `weekly_seasonality='auto'`, `daily_seasonality='auto'`, `holidays=None`, `seasonality_mode='additive'`, `seasonality_prior_scale=10.0`, `holidays_prior_scale=10.0`, `changepoint_prior_scale=0.05`, `mcmc_samples=0`, `interval_width=0.80`, `uncertainty_samples=1000`, `stan_backend='PYSTAN'` | Prophet object |
| `fit()` | Fit the model | `df` (DataFrame with 'ds' and 'y' columns), `**kwargs` | Fitted model |
| `predict()` | Generate forecasts | `df` (DataFrame with 'ds' column) | DataFrame with predictions |
| `make_future_dataframe()` | Create future dates | `periods` (int), `freq='D'`, `include_history=True` | DataFrame of dates |
| `add_seasonality()` | Add a custom seasonality | `name` (str), `period` (float), `fourier_order` (int), `prior_scale=None`, `mode=None`, `condition_name=None` | None |
| `add_regressor()` | Add an external regressor | `name` (str), `prior_scale=None`, `standardize='auto'`, `mode=None` | None |
| `add_country_holidays()` | Add built-in country holidays | `country_name` (str) | None |
| `plot()` | Plot the forecast | `fcst` (DataFrame), `ax=None`, `uncertainty=True`, `plot_cap=True`, `xlabel='ds'`, `ylabel='y'` | matplotlib.figure.Figure |
| `plot_components()` | Plot individual components | `fcst` (DataFrame), `uncertainty=True`, `plot_cap=True`, `weekly_start=0`, `yearly_start=0` | matplotlib.figure.Figure |
Helper functions
| Function | Description | Parameters | Return value |
|---|---|---|---|
| `cross_validation()` | Time-series cross-validation | `model`, `horizon`, `initial=None`, `period=None`, `cutoffs=None`, `disable_tqdm=False` | DataFrame |
| `performance_metrics()` | Compute evaluation metrics | `df` (output of cross_validation), `metrics=None`, `rolling_window=0.1` | DataFrame |
| `plot_cross_validation_metric()` | Visualise a metric over the horizon | `df_cv`, `metric`, `rolling_window=0.1` | matplotlib.figure.Figure |
| `plot_plotly()` | Interactive Plotly forecast | `m` (model), `fcst` (forecast), `uncertainty=True`, `plot_cap=True`, `trend=False`, `changepoints=False`, `changepoints_threshold=0.01` | plotly.graph_objects.Figure |
| `plot_components_plotly()` | Interactive component plots | `m` (model), `fcst` (forecast), `uncertainty=True`, `plot_cap=True`, `weekly_start=0`, `yearly_start=0` | plotly.graph_objects.Figure |
Configuration parameters
| Parameter | Type | Default value | Description |
|---|---|---|---|
| `growth` | str | `'linear'` | Growth type: `'linear'` or `'logistic'` |
| `changepoints` | list | `None` | Dates of trend changepoints |
| `n_changepoints` | int | `25` | Number of potential changepoints |
| `changepoint_range` | float | `0.8` | Fraction of data used to search for changepoints |
| `yearly_seasonality` | bool/str | `'auto'` | Yearly seasonality |
| `weekly_seasonality` | bool/str | `'auto'` | Weekly seasonality |
| `daily_seasonality` | bool/str | `'auto'` | Daily seasonality |
| `holidays` | DataFrame | `None` | Holiday table |
| `seasonality_mode` | str | `'additive'` | Seasonality mode: `'additive'` or `'multiplicative'` |
| `seasonality_prior_scale` | float | `10.0` | Seasonality regularisation strength |
| `holidays_prior_scale` | float | `10.0` | Holiday regularisation strength |
| `changepoint_prior_scale` | float | `0.05` | Sensitivity to trend changes |
| `interval_width` | float | `0.80` | Width of the uncertainty interval |
| `uncertainty_samples` | int | `1000` | Number of samples for uncertainty estimation |
Practical use‑case examples
E‑commerce sales forecasting
def forecast_ecommerce_sales(df):
"""Forecast sales for an online store"""
# Define holidays
holidays = pd.DataFrame({
'holiday': ['Black Friday', 'Cyber Monday', 'New Year Sale'],
'ds': pd.to_datetime(['2023-11-24', '2023-11-27', '2023-01-01']),
'lower_window': [-7, -3, 0],
'upper_window': [1, 1, 7],
})
# Define marketing campaigns
marketing_events = pd.DataFrame({
'holiday': 'marketing_campaign',
'ds': pd.to_datetime(['2023-03-15', '2023-06-15', '2023-09-15']),
'lower_window': 0,
'upper_window': 14,
})
all_holidays = pd.concat([holidays, marketing_events], ignore_index=True)
# Initialise model
model = Prophet(
holidays=all_holidays,
seasonality_mode='multiplicative',
yearly_seasonality=True,
weekly_seasonality=True,
daily_seasonality=False
)
# Add regressors
model.add_regressor('marketing_spend')
model.add_regressor('competitor_price', standardize=True)
# Fit
model.fit(df)
# 90‑day forecast
future = model.make_future_dataframe(periods=90)
future['marketing_spend'] = np.random.exponential(1000, len(future))
future['competitor_price'] = np.random.normal(100, 20, len(future))
forecast = model.predict(future)
return model, forecast
Website traffic forecasting
def forecast_website_traffic(df):
"""Forecast website visitor traffic"""
# Flag weekends
df['is_weekend'] = df['ds'].dt.dayofweek.isin([5, 6]).astype(int)
# Initialise model
model = Prophet(
daily_seasonality=True,
weekly_seasonality=False, # Disable standard weekly seasonality
yearly_seasonality=True,
changepoint_prior_scale=0.1
)
# Conditional weekly seasonality for workdays/weekends
model.add_seasonality(
name='weekly_workdays',
period=7,
fourier_order=3,
condition_name='is_weekend'
)
# Add regressor
model.add_regressor('is_weekend')
# Fit
model.fit(df)
# 30‑day forecast
future = model.make_future_dataframe(periods=30)
future['is_weekend'] = future['ds'].dt.dayofweek.isin([5, 6]).astype(int)
forecast = model.predict(future)
return model, forecast
Energy consumption forecasting
def forecast_energy_consumption(df):
"""Forecast energy consumption"""
# Initialise model
model = Prophet(
yearly_seasonality=True,
weekly_seasonality=True,
daily_seasonality=True,
seasonality_mode='additive'
)
# Add regressors
model.add_regressor('temperature')
model.add_regressor('humidity')
model.add_regressor('is_holiday')
# Custom seasonality for heating season
model.add_seasonality(
name='heating_season',
period=365.25,
fourier_order=8,
condition_name='is_winter'
)
# Fit
model.fit(df)
return model
Performance optimisation
Accelerating training
# Use CMDSTAN for large datasets
model = Prophet(stan_backend='CMDSTANPY')
# Reduce number of uncertainty samples
model = Prophet(uncertainty_samples=100)
# Aggregate data for very large series
df_daily = df.groupby(df['ds'].dt.date).agg({'y': 'mean'}).reset_index()
df_daily['ds'] = pd.to_datetime(df_daily['ds'])
Parallel training of multiple models
from concurrent.futures import ProcessPoolExecutor
import multiprocessing
def train_single_model(args):
    """Fit one Prophet model; defined at module level so it can be
    pickled by ProcessPoolExecutor (nested functions cannot be)."""
    name, df = args
    model = Prophet()
    model.fit(df)
    return name, model
def train_prophet_parallel(data_dict):
    """Parallel training of multiple Prophet models"""
    # Use all available CPU cores
    with ProcessPoolExecutor(max_workers=multiprocessing.cpu_count()) as executor:
        results = list(executor.map(train_single_model, data_dict.items()))
    return dict(results)
Model diagnostics and debugging
Residual analysis
def analyze_residuals(model, df, forecast):
"""Analyse model residuals"""
# Historical forecasts
historical_forecast = forecast[forecast['ds'].isin(df['ds'])]
residuals = df['y'].values - historical_forecast['yhat'].values
# Residual statistics
stats = {
'mean_residual': np.mean(residuals),
'std_residual': np.std(residuals),
'min_residual': np.min(residuals),
'max_residual': np.max(residuals),
'mae': np.mean(np.abs(residuals)),
'rmse': np.sqrt(np.mean(residuals**2))
}
# Visualise residuals
fig, axes = plt.subplots(2, 2, figsize=(15, 10))
# Time‑series of residuals
axes[0, 0].plot(df['ds'], residuals)
axes[0, 0].axhline(y=0, color='r', linestyle='--')
axes[0, 0].set_title('Residuals over time')
axes[0, 0].set_xlabel('Date')
axes[0, 0].set_ylabel('Residual')
# Histogram
axes[0, 1].hist(residuals, bins=30, alpha=0.7)
axes[0, 1].set_title('Residual distribution')
axes[0, 1].set_xlabel('Residual')
axes[0, 1].set_ylabel('Frequency')
# Q‑Q plot
from scipy import stats
stats.probplot(residuals, dist="norm", plot=axes[1, 0])
axes[1, 0].set_title('Q‑Q plot')
# Autocorrelation
autocorr = [np.corrcoef(residuals[:-i], residuals[i:])[0, 1]
for i in range(1, min(40, len(residuals)))]
axes[1, 1].plot(autocorr)
axes[1, 1].axhline(y=0, color='r', linestyle='--')
axes[1, 1].set_title('Residual autocorrelation')
axes[1, 1].set_xlabel('Lag')
axes[1, 1].set_ylabel('Correlation')
plt.tight_layout()
plt.show()
return stats
Changepoint inspection
def analyze_changepoints(model, forecast):
"""Inspect trend changepoints"""
# Extract changepoints and their magnitudes
changepoints = model.changepoints
deltas = model.params['delta'].mean(axis=0)
# Build DataFrame
changepoint_df = pd.DataFrame({
'changepoint': changepoints,
'delta': deltas
})
# Filter significant changes
significant_changes = changepoint_df[np.abs(changepoint_df['delta']) > 0.01]
# Visualise
fig, ax = plt.subplots(figsize=(15, 8))
ax.plot(forecast['ds'], forecast['yhat'], label='Forecast')
ax.plot(forecast['ds'], forecast['trend'], label='Trend', linestyle='--')
for _, row in significant_changes.iterrows():
ax.axvline(x=row['changepoint'], color='red', linestyle=':', alpha=0.7)
ax.text(row['changepoint'], ax.get_ylim()[1] * 0.9,
f'Δ={row["delta"]:.3f}', rotation=90)
ax.set_title('Trend changepoints')
ax.legend()
plt.show()
return significant_changes
Comparison with other forecasting methods
Comparison with ARIMA
from statsmodels.tsa.arima.model import ARIMA
from sklearn.metrics import mean_absolute_error, mean_squared_error
def compare_prophet_arima(df, test_size=30):
"""Compare Prophet with ARIMA"""
# Train‑test split
train_df = df[:-test_size]
test_df = df[-test_size:]
# Prophet
prophet_model = Prophet()
prophet_model.fit(train_df)
future = prophet_model.make_future_dataframe(periods=test_size)
prophet_forecast = prophet_model.predict(future)
prophet_pred = prophet_forecast['yhat'].tail(test_size).values
# ARIMA
arima_model = ARIMA(train_df['y'], order=(1, 1, 1))
arima_fitted = arima_model.fit()
arima_pred = arima_fitted.forecast(steps=test_size)
# Metrics
prophet_mae = mean_absolute_error(test_df['y'], prophet_pred)
prophet_rmse = np.sqrt(mean_squared_error(test_df['y'], prophet_pred))
arima_mae = mean_absolute_error(test_df['y'], arima_pred)
arima_rmse = np.sqrt(mean_squared_error(test_df['y'], arima_pred))
# Results
comparison = pd.DataFrame({
'Model': ['Prophet', 'ARIMA'],
'MAE': [prophet_mae, arima_mae],
'RMSE': [prophet_rmse, arima_rmse]
})
return comparison
Frequently asked questions
Technical questions
What should I do if the model runs slowly?
- Use stan_backend='CMDSTANPY' for large datasets
- Reduce uncertainty_samples
- Aggregate data to larger time intervals
- Lower the number of Fourier terms in seasonality (fourier_order)
How to handle multiple time series? Prophet does not support multivariate series directly. Train a separate model for each series or use external libraries such as sktime.
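The per-series approach can be sketched with a plain pandas groupby; the long-format column names (`series_id`, etc.) are assumptions for illustration:

```python
import pandas as pd
import numpy as np

# Hypothetical long-format data: one row per (series_id, date)
long_df = pd.DataFrame({
    'series_id': ['a'] * 5 + ['b'] * 5,
    'ds': list(pd.date_range('2023-01-01', periods=5, freq='D')) * 2,
    'y': np.arange(10, dtype=float),
})

# Split into one Prophet-ready frame ('ds', 'y') per series
per_series = {
    name: g[['ds', 'y']].reset_index(drop=True)
    for name, g in long_df.groupby('series_id')
}

# Each frame could then be fitted independently, e.g.:
# models = {name: Prophet().fit(sub) for name, sub in per_series.items()}
print(sorted(per_series))  # ['a', 'b']
```

This pattern combines naturally with the parallel-training helper shown earlier.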
Can Prophet be used for classification? No, Prophet is designed solely for time‑series regression. Use specialised methods for time‑series classification.
Practical questions
How to choose an optimal forecast horizon? The horizon depends on the data characteristics and business objective. Forecast accuracy typically degrades as the horizon grows. Test different horizons using cross‑validation.
What to do with non‑stationary data? Prophet automatically handles many forms of non‑stationarity through trend and seasonality modeling. For complex cases, consider additional preprocessing.
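One common preprocessing step for series with strongly growing variance is a log transform before fitting; this sketch (with synthetic data) shows the transform and its inverse:

```python
import numpy as np
import pandas as pd

# Hypothetical positive-valued series with a multiplicative growth pattern
df = pd.DataFrame({
    'ds': pd.date_range('2023-01-01', periods=60, freq='D'),
    'y': np.exp(np.linspace(0, 4, 60)),
})

# Stabilise the variance: fit Prophet on log1p(y) instead of y
df_log = df.copy()
df_log['y'] = np.log1p(df['y'])

# After predicting on the log scale, invert the transform, e.g.:
# forecast['yhat'] = np.expm1(forecast['yhat'])
recovered = np.expm1(df_log['y'])
print(np.allclose(recovered, df['y']))  # True
```

An alternative with a similar effect is to keep the original scale and use seasonality_mode='multiplicative'.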
How to interpret confidence intervals? Confidence intervals reflect model uncertainty. Wide intervals indicate high uncertainty; narrow intervals suggest confidence in the forecast.
Conclusion
Prophet is a powerful and versatile tool for time‑series forecasting that successfully combines statistical rigor with practical usability. The library excels in business scenarios where fast, interpretable results are needed without deep time‑series theory.
Key advantages of Prophet include automatic detection of seasonal patterns, robustness to outliers and missing values, the ability to incorporate external factors and holidays, and clear visualisation of results and components.
The library continues to evolve actively, receiving regular updates and improvements from the developer community. This makes Prophet a reliable choice for long‑term forecasting projects across diverse industries—from e‑commerce and finance to logistics and energy.