Intermediate 25 min

Step 3: Define Message Schema

Before building the simulator, let’s design a message schema. Consistent message structure makes processing easier.

Why Schema Matters

Without a schema:

  • Each device sends different formats
  • Processing code gets messy
  • Hard to validate data
  • Difficult to add new fields

With a schema:

  • All devices use the same format
  • Easy to validate and parse
  • Simple to add new fields
  • Clear documentation

Our Message Schema

Here’s the schema we’ll use for telemetry:


{
  "deviceId": "sensor-01",
  "ts": "2025-12-16T10:30:45.123Z",
  "metrics": {
    "temperatureC": 23.5,
    "humidityPct": 65.2,
    "batteryPct": 87.0
  },
  "seq": 42
}

Schema Design Principles

1. Consistent Structure: All messages have the same top-level fields (deviceId, ts, metrics, seq). This makes parsing predictable.

2. ISO 8601 Timestamps: Use ts (timestamp) in ISO 8601 format. With the Z (UTC) suffix it is unambiguous and carries its own timezone.

3. Nested Metrics: Group related sensor readings under metrics. A new sensor then becomes one more key inside metrics, and the top-level structure stays unchanged.

4. Sequence Numbers: Include seq for message ordering. A gap in the sequence indicates dropped messages.

5. Device ID: Always include deviceId. It identifies the sender, which is essential in multi-device systems.
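To illustrate how seq helps detect dropped messages, here is a minimal sketch. It assumes each device increments seq by exactly 1 per message and that messages arrive in order; the function name find_missing_seqs is hypothetical and not part of the tutorial's code.

```python
def find_missing_seqs(seqs):
    """Return the sequence numbers skipped between consecutive messages.

    Assumes `seqs` holds the seq values from ONE device, in arrival order,
    and that the device increments seq by 1 for every message it sends.
    """
    missing = []
    for prev, curr in zip(seqs, seqs[1:]):
        if curr > prev + 1:
            # Everything strictly between prev and curr never arrived
            missing.extend(range(prev + 1, curr))
    return missing


received = [40, 41, 42, 45, 46]
print(find_missing_seqs(received))  # [43, 44]
```

A real consumer would track the last seen seq per deviceId and would also need a policy for device restarts, where seq may start over.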

Example Messages

Here is an example message following our schema:


{
  "deviceId": "sensor-01",
  "ts": "2025-12-16T10:30:45.123Z",
  "metrics": {
    "temperatureC": 23.5,
    "humidityPct": 65.2,
    "batteryPct": 87.0
  },
  "seq": 42
}

Topic Structure

We’ll use this topic pattern:

iot/devices/{deviceId}/telemetry

Examples:

  • iot/devices/sensor-01/telemetry
  • iot/devices/sensor-02/telemetry
  • iot/devices/room-101/telemetry

This structure:

  • Puts all device topics under a common prefix (iot/devices)
  • Includes the device ID, so each device publishes to its own topic
  • Ends with the message purpose (telemetry)
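The topic pattern can be captured in two small helpers, one to build a topic and one to recover the device ID from it. This is a sketch only; the names telemetry_topic and device_id_from_topic are hypothetical, and rejecting a "/" inside the device ID is an assumption made so the ID always occupies exactly one topic level.

```python
TOPIC_PREFIX = "iot/devices"
TOPIC_SUFFIX = "telemetry"


def telemetry_topic(device_id):
    """Build the telemetry topic for one device."""
    # Assumption: a device ID must be non-empty and fit in one topic level
    if not device_id or "/" in device_id:
        raise ValueError(f"invalid device ID: {device_id!r}")
    return f"{TOPIC_PREFIX}/{device_id}/{TOPIC_SUFFIX}"


def device_id_from_topic(topic):
    """Return the device ID from a telemetry topic, or None if it doesn't match."""
    parts = topic.split("/")
    matches = (
        len(parts) == 4
        and parts[:2] == ["iot", "devices"]
        and parts[3] == TOPIC_SUFFIX
        and parts[2] != ""
    )
    return parts[2] if matches else None


print(telemetry_topic("sensor-01"))                            # iot/devices/sensor-01/telemetry
print(device_id_from_topic("iot/devices/room-101/telemetry"))  # room-101
```

If the broker is MQTT (an assumption; the protocol is not named on this page), a subscriber could receive telemetry from every device with the single-level wildcard iot/devices/+/telemetry.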

Validation Rules

When processing messages, validate:

  1. Required fields are present

    • deviceId must exist
    • ts must be a valid ISO 8601 timestamp
    • metrics must be an object
    • seq must be a number

  2. Values are within range

    • temperatureC: -50 to 100 (a reasonable range, in °C)
    • humidityPct: 0 to 100
    • batteryPct: 0 to 100

  3. Data types are correct

    • Metric values are floats; seq is an integer
    • deviceId is a string
    • ts is a string
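The rules above could be checked with a function like the following. It is a sketch under stated assumptions, not the tutorial's own validator: validate_message and METRIC_RANGES are hypothetical names, individual metrics are treated as optional, and integer metric values are accepted alongside floats because JSON does not distinguish the two. datetime.fromisoformat covers only a subset of ISO 8601, which is enough for timestamps in the format shown in the schema.

```python
from datetime import datetime

# Allowed (low, high) range per metric, taken from the rules above
METRIC_RANGES = {
    "temperatureC": (-50.0, 100.0),
    "humidityPct": (0.0, 100.0),
    "batteryPct": (0.0, 100.0),
}


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_message(msg):
    """Return a list of problems; an empty list means the message is valid."""
    if not isinstance(msg, dict):
        return ["message must be a JSON object"]
    errors = []

    device_id = msg.get("deviceId")
    if not isinstance(device_id, str) or not device_id:
        errors.append("deviceId must be a non-empty string")

    ts = msg.get("ts")
    if not isinstance(ts, str):
        errors.append("ts must be a string")
    else:
        try:
            # A trailing "Z" is only accepted from Python 3.11 on, so normalize it
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except ValueError:
            errors.append("ts must be valid ISO 8601")

    seq = msg.get("seq")
    if not isinstance(seq, int) or isinstance(seq, bool):
        errors.append("seq must be an integer")

    metrics = msg.get("metrics")
    if not isinstance(metrics, dict):
        errors.append("metrics must be an object")
    else:
        for name, (low, high) in METRIC_RANGES.items():
            if name not in metrics:
                continue  # assumption: individual metrics are optional
            value = metrics[name]
            if not _is_number(value):
                errors.append(f"{name} must be a number")
            elif not low <= value <= high:
                errors.append(f"{name} out of range [{low}, {high}]")
    return errors


good = {
    "deviceId": "sensor-01",
    "ts": "2025-12-16T10:30:45.123Z",
    "metrics": {"temperatureC": 23.5, "humidityPct": 65.2, "batteryPct": 87.0},
    "seq": 42,
}
print(validate_message(good))  # []
```

Returning a list of problems instead of raising on the first one makes it easier to log everything that is wrong with a rejected message.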

Python Schema Helper

Here’s a Python function to create messages:

🐍 Python Message Creator
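A minimal sketch of a message creator, using only the standard library. The function name create_message, its parameters, and the optional now argument (added so the example output is reproducible) are illustrative assumptions, not necessarily what the simulator on the next page uses.

```python
import json
from datetime import datetime, timezone


def create_message(device_id, seq, temperature_c, humidity_pct, battery_pct, now=None):
    """Build a telemetry message dict that follows the schema on this page."""
    now = now or datetime.now(timezone.utc)
    # ISO 8601 in UTC with millisecond precision and a "Z" suffix,
    # e.g. 2025-12-16T10:30:45.123Z
    ts = (
        now.astimezone(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {
        "deviceId": device_id,
        "ts": ts,
        "metrics": {
            "temperatureC": float(temperature_c),
            "humidityPct": float(humidity_pct),
            "batteryPct": float(battery_pct),
        },
        "seq": int(seq),
    }


# A fixed timestamp keeps the example deterministic; omit `now` in real use
fixed = datetime(2025, 12, 16, 10, 30, 45, 123000, tzinfo=timezone.utc)
message = create_message("sensor-01", 42, 23.5, 65.2, 87.0, now=fixed)
print(json.dumps(message, indent=2))
```

With the fixed timestamp above, the printed JSON matches the example message shown earlier on this page.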

What NOT to Do

Bad examples:

Inconsistent structure:

{"temp": 23.5, "hum": 65.2}  // Missing fields, different names

No timestamp:

{"deviceId": "sensor-01", "temperature": 23.5}  // When was this?

Flat structure:

{"deviceId": "sensor-01", "temperature": 23.5, "humidity": 65.2, "battery": 87.0}  // Hard to extend

Good (our schema):

{
  "deviceId": "sensor-01",
  "ts": "2025-12-16T10:30:45.123Z",
  "metrics": {"temperatureC": 23.5, "humidityPct": 65.2, "batteryPct": 87.0},
  "seq": 1
}

Checkpoint ✅

You should understand:

  • ✅ Why consistent schemas matter
  • ✅ Our telemetry message structure
  • ✅ Topic naming pattern
  • ✅ Validation rules
  • ✅ How to create messages in Python

What’s Next?

On the next page, you’ll build the Python device simulator that publishes messages following this schema. It will send telemetry every 2 seconds with realistic sensor values.