Writing and Reading Time Series with TimeDB
This notebook demonstrates core TimeDB workflows:
Creating series with labels
Inserting data via the fluent API
Exploring series with
list_series()andlist_labels()Reading data back
Progressive filtering with
.where()
[1]:
import pandas as pd
from datetime import datetime, timezone, timedelta
from dotenv import load_dotenv
from timedb import TimeDataClient
load_dotenv()
td = TimeDataClient()
Setup
Create a fresh schema with wind and solar series across two sites.
[2]:
td.delete()
td.create()
series_configs = [
{'name': 'wind_power', 'unit': 'MW', 'labels': {'turbine': 'T01', 'site': 'Gotland', 'type': 'onshore'}},
{'name': 'wind_power', 'unit': 'MW', 'labels': {'turbine': 'T02', 'site': 'Gotland', 'type': 'onshore'}},
{'name': 'wind_power', 'unit': 'MW', 'labels': {'turbine': 'T03', 'site': 'Gotland', 'type': 'offshore'}},
{'name': 'wind_power', 'unit': 'MW', 'labels': {'turbine': 'T01', 'site': 'Skane', 'type': 'onshore'}},
{'name': 'solar_power', 'unit': 'MW', 'labels': {'panel_id': 'P01', 'site': 'Gotland'}},
{'name': 'solar_power', 'unit': 'MW', 'labels': {'panel_id': 'P02', 'site': 'Skane'}},
]
for config in series_configs:
td.create_series(**config)
print(f"Created {len(series_configs)} series")
Creating database schema...
✓ Schema created successfully
Created 6 series
Insert Data
DataFrames must have fixed columns: ['valid_time', 'value'] for point-in-time data, or ['valid_time', 'valid_time_end', 'value'] for intervals.
[3]:
base_time = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
dates = [base_time + timedelta(hours=i) for i in range(24)]
# Insert data for each Gotland turbine
gotland_wind = td.series("wind_power").where(site="Gotland")
for turbine in gotland_wind.list_labels('turbine'):
df = pd.DataFrame({
'valid_time': dates,
'value': [50.0 + i * 1.5 for i in range(24)],
})
gotland_wind.where(turbine=turbine).insert(df)
print(f"Inserted 24 points for turbine {turbine}")
Inserted 24 points for turbine T01
Inserted 24 points for turbine T02
Inserted 24 points for turbine T03
Explore Series
Use list_series() to see full metadata and list_labels() for quick label discovery.
[4]:
# list_series() returns full metadata for each matching series
all_wind = td.series("wind_power")
print(f"All wind power series ({all_wind.count()}):\n")
for s in all_wind.list_series():
print(f" id={s['series_id']} unit={s['unit']} labels={s['labels']}")
All wind power series (4):
id=1 unit=MW labels={'site': 'Gotland', 'type': 'onshore', 'turbine': 'T01'}
id=2 unit=MW labels={'site': 'Gotland', 'type': 'onshore', 'turbine': 'T02'}
id=3 unit=MW labels={'site': 'Gotland', 'type': 'offshore', 'turbine': 'T03'}
id=4 unit=MW labels={'site': 'Skane', 'type': 'onshore', 'turbine': 'T01'}
[5]:
# list_labels() returns unique values for a single label dimension
wind = td.series("wind_power")
for site in wind.list_labels("site"):
site_wind = wind.where(site=site)
turbines = site_wind.list_labels("turbine")
print(f"{site}: {site_wind.count()} turbines - {turbines}")
Gotland: 3 turbines - ['T01', 'T02', 'T03']
Skane: 1 turbines - ['T01']
Read Data
TimeDB enforces single-series reads for data consistency. To read multiple series, use a loop or read them individually.
[6]:
# Read a single series
df_t01 = td.series("wind_power").where(site="Gotland", turbine="T01").read()
print("T01 at Gotland:")
print(df_t01.head())
# To read multiple series, loop through them
print(f"\nReading all Gotland wind turbines:")
gotland_wind = td.series("wind_power").where(site="Gotland")
dfs = {}
for turbine in gotland_wind.list_labels('turbine'):
df = gotland_wind.where(turbine=turbine).read()
dfs[turbine] = df
print(f" {turbine}: {len(df)} rows")
# Combine into a single DataFrame if needed
df_combined = pd.concat(dfs, axis=1)
print(f"\nCombined DataFrame shape: {df_combined.shape}")
print(df_combined.head())
T01 at Gotland:
value
valid_time
2025-01-01 00:00:00+00:00 50.0
2025-01-01 01:00:00+00:00 51.5
2025-01-01 02:00:00+00:00 53.0
2025-01-01 03:00:00+00:00 54.5
2025-01-01 04:00:00+00:00 56.0
Reading all Gotland wind turbines:
T01: 24 rows
T02: 24 rows
T03: 24 rows
Combined DataFrame shape: (24, 3)
T01 T02 T03
value value value
valid_time
2025-01-01 00:00:00+00:00 50.0 50.0 50.0
2025-01-01 01:00:00+00:00 51.5 51.5 51.5
2025-01-01 02:00:00+00:00 53.0 53.0 53.0
2025-01-01 03:00:00+00:00 54.5 54.5 54.5
2025-01-01 04:00:00+00:00 56.0 56.0 56.0
Progressive Filtering
Start broad and narrow down with .where(). Each call returns a new collection.
[7]:
wind = td.series("wind_power")
print(f"All wind power: {wind.count()} series")
gotland = wind.where(site="Gotland")
print(f" Gotland: {gotland.count()} series")
onshore = gotland.where(type="onshore")
print(f" Onshore: {onshore.count()} series")
print(f" Turbines: {onshore.list_labels('turbine')}")
All wind power: 4 series
Gotland: 3 series
Onshore: 2 series
Turbines: ['T01', 'T02']
Error Handling
The API provides clear error messages for common mistakes.
[8]:
# 1. No series found
try:
td.series("wind_power").where(site="Mars").read()
except ValueError as e:
print(f"No match: {e}")
# 2. Wrong column names
try:
df_wrong = pd.DataFrame({'valid_time': dates[:3], 'power': [1, 2, 3]})
td.series("wind_power").where(site="Gotland", turbine="T01").insert(df_wrong)
except ValueError as e:
print(f"\nWrong columns: {e}")
# 3. Missing time column
try:
td.series("wind_power").where(site="Gotland", turbine="T01").insert(pd.DataFrame({'value': [1.0]}))
except ValueError as e:
print(f"\nMissing column: {e}")
No match: No series found matching filters: name=wind_power, unit=None, labels={'site': 'Mars'}
Wrong columns: For point-in-time data, DataFrame must have columns ['valid_time', 'value']. Found: ['valid_time', 'power']
Missing column: DataFrame must have 2 columns ['valid_time', 'value'] or 3 columns ['valid_time', 'valid_time_end', 'value']. Found 1 columns: ['value']
Best Practices
Single-series operations: All read, insert, and update operations work on a single series at a time
Use
list_series()andcount()to verify your filters resolve to one seriesNarrow to a single series with
.where()before reading or insertingFixed DataFrame columns: Use
['valid_time', 'value']for point-in-time or['valid_time', 'valid_time_end', 'value']for intervalsvalid_timemust have timezone-aware timestampsUse
overlapping=Truefor forecasts/versioned data (default is flat/immutable)To work with multiple series, loop through them using
list_labels()orlist_series()