Unit Validation and Conversion with TimeDB

This notebook demonstrates TimeDB’s pint unit handling:

  1. Inserting data with pint-pandas dtypes — automatic unit conversion on insert

  2. Reading data back with as_pint=True to get pint-typed columns

  3. Automatic conversion between compatible units (kW → MW)

  4. Rejection of incompatible units (MWh vs MW)

Requires: pip install timedb[pint]

[1]:
import pandas as pd
import pint_pandas  # imported for its side effect: registers the "pint[...]" dtype with pandas
from datetime import datetime, timezone, timedelta
from dotenv import load_dotenv
from timedb import TimeDataClient
load_dotenv()

td = TimeDataClient()
# Start from a clean schema. delete() is destructive, so run this only against a scratch database.
td.delete()
td.create()
Creating database schema...
✓ Schema created successfully

Insert Data with pint Units

Create series and insert data using pint-pandas dtypes. When the DataFrame’s value column has a pint dtype, TimeDB validates its unit against the series’ canonical unit and converts the values if the two differ.

[2]:
base_time = datetime(2025, 1, 1, 0, 0, tzinfo=timezone.utc)
times = [base_time + timedelta(hours=i) for i in range(24)]

# Create series with specific units
td.create_series(name="power", unit="MW")
td.create_series(name="wind_speed", unit="m/s")
td.create_series(name="temperature", unit="degree_Celsius")

# Insert with matching pint units — stripped to plain floats internally
for name, unit, values in [
    ("power", "MW", [1.0 + i * 0.05 for i in range(24)]),
    ("wind_speed", "m/s", [5.0 + i * 0.2 for i in range(24)]),
    ("temperature", "degree_Celsius", [20.0 + i * 0.5 for i in range(24)]),
]:
    df = pd.DataFrame({
        "valid_time": times,
        "value": pd.array(values, dtype=f"pint[{unit}]"),
    })
    td.series(name).insert(df)

print("Inserted 3 series with pint units")
Inserted 3 series with pint units

[3]:
# Read back without pint (default) — plain float64
df_plain = td.series("power").read()
print("Default read (plain float64):")
print(f"  dtype: {df_plain['value'].dtype}")
print(f"  head: {df_plain['value'].head(3).tolist()}")
print()

# Read back with as_pint=True — pint dtype
df_pint = td.series("power").read(as_pint=True)
print("Read with as_pint=True:")
print(f"  dtype: {df_pint['value'].dtype}")
print(f"  head: {df_pint['value'].head(3).tolist()}")
Default read (plain float64):
  dtype: float64
  head: [1.0, 1.05, 1.1]

Read with as_pint=True:
  dtype: pint[megawatt][Float64]
  head: [<Quantity(1.0, 'megawatt')>, <Quantity(1.05, 'megawatt')>, <Quantity(1.1, 'megawatt')>]

Automatic Unit Conversion

Compatible units are converted automatically on insert. Inserting kW values into a MW series divides them by 1,000, so 500 kW is stored as 0.5 MW.

[4]:
# Insert kilowatt values into a megawatt series — auto-converted
new_times = [base_time + timedelta(hours=i) for i in range(24, 48)]

df_kw = pd.DataFrame({
    "valid_time": new_times,
    "value": pd.array([500.0] * 24, dtype="pint[kW]"),
})

td.series("power").insert(df_kw)

# Read back — should show 0.5 MW (500 kW converted)
df_check = td.series("power").read(
    start_valid=new_times[0],
    end_valid=new_times[0] + timedelta(hours=1)
)
print(f"Inserted 500 kW, stored as: {df_check['value'].iloc[0]} MW (auto-converted)")
Inserted 500 kW, stored as: 0.5 MW (auto-converted)
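
The arithmetic behind this conversion is a ratio of scale factors. The sketch below shows it in plain Python. It is not TimeDB's implementation (which relies on pint); the TO_WATT table and convert_power helper are hypothetical names used only for illustration.

```python
# Hypothetical scale factors: how many watts one of each unit represents.
TO_WATT = {"W": 1.0, "kW": 1e3, "MW": 1e6, "GW": 1e9}


def convert_power(value: float, src: str, dst: str) -> float:
    """Convert a power value from src to dst via the common base unit (watt)."""
    return value * TO_WATT[src] / TO_WATT[dst]


stored = convert_power(500.0, "kW", "MW")
print(stored)  # 0.5
```

Going through a common base unit keeps the table linear in the number of units, rather than needing one factor per pair of units.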

Dimensionless and Plain Floats

  • Plain float64 (no pint dtype): passed through unchanged, no unit check — backward compatible

  • Pint dimensionless: treated as already in the series’ canonical unit, so values are stored unchanged

[5]:
# Plain float64 — works as before, no unit check
df_plain = pd.DataFrame({
    "valid_time": [base_time + timedelta(hours=48)],
    "value": [999.0],  # plain float, no pint
})
td.series("power").insert(df_plain)
print("Plain float64 insert: OK (no unit check)")

# Dimensionless pint — treated as series unit
df_dimless = pd.DataFrame({
    "valid_time": [base_time + timedelta(hours=49)],
    "value": pd.array([42.0], dtype="pint[dimensionless]"),
})
td.series("power").insert(df_dimless)
print("Pint dimensionless insert: OK (treated as MW)")
Plain float64 insert: OK (no unit check)
Pint dimensionless insert: OK (treated as MW)
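
The three input kinds (plain float, pint dimensionless, pint with a unit) can be pictured as one small dispatch. The following is a hedged sketch in plain Python, assuming a hypothetical to_series_unit helper and TO_WATT table; TimeDB's real code path is not shown in this notebook and may differ.

```python
from typing import List, Optional, Sequence

# Hypothetical scale factors relative to the watt.
TO_WATT = {"W": 1.0, "kW": 1e3, "MW": 1e6}


def to_series_unit(
    values: Sequence[float], unit: Optional[str], series_unit: str
) -> List[float]:
    """Return values expressed in series_unit.

    unit=None models a plain float64 column and "dimensionless" models a
    pint dimensionless column; both pass through unchanged.
    """
    if unit is None or unit == "dimensionless":
        return list(values)
    # Unit-carrying input: rescale through the common base unit (watt).
    return [v * TO_WATT[unit] / TO_WATT[series_unit] for v in values]


print(to_series_unit([999.0], None, "MW"))            # [999.0]
print(to_series_unit([42.0], "dimensionless", "MW"))  # [42.0]
print(to_series_unit([500.0], "kW", "MW"))            # [0.5]
```

Because the first two branches skip any check, a plain float that was measured in kW would be stored as if it were MW. Attaching a pint dtype is what enables the conversion.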

Unit Validation — Incompatible Units Rejected

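Incompatible units raise IncompatibleUnitError. For example, MWh (energy) cannot be stored in a MW (power) series, because the two have different dimensions and no scale factor converts one into the other.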

[6]:
from timedb.sdk import IncompatibleUnitError

# Try inserting MWh (energy) into MW (power) series — should fail
df_mwh = pd.DataFrame({
    "valid_time": new_times[:1],
    "value": pd.array([10.0], dtype="pint[MWh]"),
})

try:
    td.series("power").insert(df_mwh)
    print("Unexpected: should have failed")
except IncompatibleUnitError as e:
    print(f"Rejected: {type(e).__name__}")
    print(f"  {e}")
Rejected: IncompatibleUnitError
  Cannot convert 'megawatt_hour' to series unit 'MW'. Units are not dimensionally compatible.
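
Why MWh and MW are not interchangeable can be seen by tracking dimension exponents. This is a minimal plain-Python sketch of dimensional analysis, not pint's or TimeDB's actual code; the dictionaries and the multiply helper are hypothetical.

```python
# Dimension exponents over mass, length and time (illustrative names).
WATT = {"mass": 1, "length": 2, "time": -3}  # kg·m²/s³
HOUR = {"time": 1}


def multiply(a: dict, b: dict) -> dict:
    """Multiply two units by adding their dimension exponents."""
    summed = {k: a.get(k, 0) + b.get(k, 0) for k in set(a) | set(b)}
    # Drop dimensions whose exponents cancel to zero.
    return {k: v for k, v in summed.items() if v != 0}


MW = WATT                    # a prefix changes the scale, not the dimension
MWH = multiply(WATT, HOUR)   # power × time = energy

print(sorted(MWH.items()))  # [('length', 2), ('mass', 1), ('time', -2)]
print(MW == MWH)            # False
```

The time exponent differs (-3 for power, -2 for energy), so no multiplicative factor can map one onto the other. By contrast, kW and MW share the same exponents and differ only in scale.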

Summary

  • Insert with pint dtype: Units are validated and converted automatically. Incompatible units are rejected.

  • Insert with plain float64: No unit check — backward compatible.

  • Read with as_pint=True: Returns value column as pint-pandas dtype with the series’ canonical unit.

  • Read default: Returns plain float64 values (no pint dependency needed).