Writing and Reading Time Series with TimeDB

This notebook demonstrates core TimeDB workflows:

  1. Creating series with labels

  2. Inserting data via the fluent API

  3. Exploring series with list_series() and list_labels()

  4. Reading data back

  5. Progressive filtering with .where()

[1]:
import pandas as pd
from datetime import datetime, timezone, timedelta
from dotenv import load_dotenv
from timedb import TimeDataClient
load_dotenv()

td = TimeDataClient()

Setup

Create a fresh schema with wind and solar series across two sites.

[2]:
td.delete()
td.create()

series_configs = [
    {'name': 'wind_power', 'unit': 'MW', 'labels': {'turbine': 'T01', 'site': 'Gotland', 'type': 'onshore'}},
    {'name': 'wind_power', 'unit': 'MW', 'labels': {'turbine': 'T02', 'site': 'Gotland', 'type': 'onshore'}},
    {'name': 'wind_power', 'unit': 'MW', 'labels': {'turbine': 'T03', 'site': 'Gotland', 'type': 'offshore'}},
    {'name': 'wind_power', 'unit': 'MW', 'labels': {'turbine': 'T01', 'site': 'Skane', 'type': 'onshore'}},
    {'name': 'solar_power', 'unit': 'MW', 'labels': {'panel_id': 'P01', 'site': 'Gotland'}},
    {'name': 'solar_power', 'unit': 'MW', 'labels': {'panel_id': 'P02', 'site': 'Skane'}},
]

for config in series_configs:
    td.create_series(**config)

print(f"Created {len(series_configs)} series")
Creating database schema...
✓ Schema created successfully
Created 6 series

Insert Data

DataFrames must have fixed columns: ['valid_time', 'value'] for point-in-time data, or ['valid_time', 'valid_time_end', 'value'] for intervals.

[3]:
base_time = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
dates = [base_time + timedelta(hours=i) for i in range(24)]

# Insert data for each Gotland turbine
gotland_wind = td.series("wind_power").where(site="Gotland")

for turbine in gotland_wind.list_labels('turbine'):
    df = pd.DataFrame({
        'valid_time': dates,
        'value': [50.0 + i * 1.5 for i in range(24)],
    })
    gotland_wind.where(turbine=turbine).insert(df)
    print(f"Inserted 24 points for turbine {turbine}")
Inserted 24 points for turbine T01
Inserted 24 points for turbine T02
Inserted 24 points for turbine T03

Explore Series

Use list_series() to see full metadata and list_labels() for quick label discovery.

[4]:
# list_series() returns full metadata for each matching series
all_wind = td.series("wind_power")
print(f"All wind power series ({all_wind.count()}):\n")

for s in all_wind.list_series():
    print(f"  id={s['series_id']}  unit={s['unit']}  labels={s['labels']}")
All wind power series (4):

  id=1  unit=MW  labels={'site': 'Gotland', 'type': 'onshore', 'turbine': 'T01'}
  id=2  unit=MW  labels={'site': 'Gotland', 'type': 'onshore', 'turbine': 'T02'}
  id=3  unit=MW  labels={'site': 'Gotland', 'type': 'offshore', 'turbine': 'T03'}
  id=4  unit=MW  labels={'site': 'Skane', 'type': 'onshore', 'turbine': 'T01'}
[5]:
# list_labels() returns unique values for a single label dimension
wind = td.series("wind_power")

for site in wind.list_labels("site"):
    site_wind = wind.where(site=site)
    turbines = site_wind.list_labels("turbine")
    print(f"{site}: {site_wind.count()} turbines - {turbines}")
Gotland: 3 turbines - ['T01', 'T02', 'T03']
Skane: 1 turbines - ['T01']

Read Data

TimeDB enforces single-series reads for data consistency. To read multiple series, use a loop or read them individually.

[6]:
# Read a single series
df_t01 = td.series("wind_power").where(site="Gotland", turbine="T01").read()
print("T01 at Gotland:")
print(df_t01.head())

# To read multiple series, loop through them
print(f"\nReading all Gotland wind turbines:")
gotland_wind = td.series("wind_power").where(site="Gotland")

dfs = {}
for turbine in gotland_wind.list_labels('turbine'):
    df = gotland_wind.where(turbine=turbine).read()
    dfs[turbine] = df
    print(f"  {turbine}: {len(df)} rows")

# Combine into a single DataFrame if needed
df_combined = pd.concat(dfs, axis=1)
print(f"\nCombined DataFrame shape: {df_combined.shape}")
print(df_combined.head())
T01 at Gotland:
                           value
valid_time
2025-01-01 00:00:00+00:00   50.0
2025-01-01 01:00:00+00:00   51.5
2025-01-01 02:00:00+00:00   53.0
2025-01-01 03:00:00+00:00   54.5
2025-01-01 04:00:00+00:00   56.0

Reading all Gotland wind turbines:
  T01: 24 rows
  T02: 24 rows
  T03: 24 rows

Combined DataFrame shape: (24, 3)
                            T01   T02   T03
                          value value value
valid_time
2025-01-01 00:00:00+00:00  50.0  50.0  50.0
2025-01-01 01:00:00+00:00  51.5  51.5  51.5
2025-01-01 02:00:00+00:00  53.0  53.0  53.0
2025-01-01 03:00:00+00:00  54.5  54.5  54.5
2025-01-01 04:00:00+00:00  56.0  56.0  56.0

Progressive Filtering

Start broad and narrow down with .where(). Each call returns a new collection.

[7]:
wind = td.series("wind_power")
print(f"All wind power: {wind.count()} series")

gotland = wind.where(site="Gotland")
print(f"  Gotland: {gotland.count()} series")

onshore = gotland.where(type="onshore")
print(f"    Onshore: {onshore.count()} series")
print(f"    Turbines: {onshore.list_labels('turbine')}")
All wind power: 4 series
  Gotland: 3 series
    Onshore: 2 series
    Turbines: ['T01', 'T02']

Error Handling

The API provides clear error messages for common mistakes.

[8]:
# 1. No series found
try:
    td.series("wind_power").where(site="Mars").read()
except ValueError as e:
    print(f"No match: {e}")

# 2. Wrong column names
try:
    df_wrong = pd.DataFrame({'valid_time': dates[:3], 'power': [1, 2, 3]})
    td.series("wind_power").where(site="Gotland", turbine="T01").insert(df_wrong)
except ValueError as e:
    print(f"\nWrong columns: {e}")

# 3. Missing time column
try:
    td.series("wind_power").where(site="Gotland", turbine="T01").insert(pd.DataFrame({'value': [1.0]}))
except ValueError as e:
    print(f"\nMissing column: {e}")
No match: No series found matching filters: name=wind_power, unit=None, labels={'site': 'Mars'}

Wrong columns: For point-in-time data, DataFrame must have columns ['valid_time', 'value']. Found: ['valid_time', 'power']

Missing column: DataFrame must have 2 columns ['valid_time', 'value'] or 3 columns ['valid_time', 'valid_time_end', 'value']. Found 1 columns: ['value']

Best Practices

  • Single-series operations: All read, insert, and update operations work on a single series at a time

  • Use list_series() and count() to verify your filters resolve to one series

  • Narrow to a single series with .where() before reading or inserting

  • Fixed DataFrame columns: Use ['valid_time', 'value'] for point-in-time or ['valid_time', 'valid_time_end', 'value'] for intervals

  • valid_time must have timezone-aware timestamps

  • Use overlapping=True for forecasts/versioned data (default is flat/immutable)

  • To work with multiple series, loop through them using list_labels() or list_series()