Step 3: Define Message Schema
Before building the simulator, let’s design a message schema. Consistent message structure makes processing easier.
Why Schema Matters
Without a schema:
- Each device sends different formats
- Processing code gets messy
- Hard to validate data
- Difficult to add new fields
With a schema:
- All devices use the same format
- Easy to validate and parse
- Simple to add new fields
- Clear documentation
Our Message Schema
Here’s the schema we’ll use for telemetry:
{
"deviceId": "sensor-01",
"ts": "2025-12-16T10:30:45.123Z",
"metrics": {
"temperatureC": 23.5,
"humidityPct": 65.2,
"batteryPct": 87.0
},
"seq": 42
}
Field Descriptions
- deviceId: Unique identifier for the device (string)
- ts: ISO 8601 timestamp when data was collected (string)
- metrics: Object containing sensor readings
- temperatureC: Temperature in Celsius (float)
- humidityPct: Humidity percentage 0-100 (float)
- batteryPct: Battery level 0-100 (float)
- seq: Sequence number for message ordering (integer)
Schema Design Principles
1. Consistent Structure
All messages have the same top-level fields. This makes parsing predictable.
2. ISO 8601 Timestamps
Use ts (timestamp) in ISO 8601 format with an explicit UTC marker (the trailing Z). That keeps the value unambiguous across time zones.
3. Nested Metrics
Group related sensor readings together. Makes it easy to add new sensors later.
4. Sequence Numbers
Include seq for message ordering. Helps detect dropped messages.
5. Device ID
Always include deviceId. Essential for multi-device systems.
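As a rough illustration of principle 4, the sketch below counts missing messages per device by comparing each seq with the previous one from the same deviceId. SequenceTracker and observe are hypothetical names, not part of this tutorial's code, and the sketch assumes seq increases by exactly 1 per message.

```python
class SequenceTracker:
    """Remember the last seq seen per device and report gaps (hypothetical helper)."""

    def __init__(self):
        self.last_seq = {}  # deviceId -> last sequence number seen

    def observe(self, device_id, seq):
        """Return how many messages appear to be missing before this one."""
        last = self.last_seq.get(device_id)
        self.last_seq[device_id] = seq
        if last is None or seq <= last:
            # First message from this device, a duplicate, or a counter reset:
            # there is no gap to report in these cases.
            return 0
        return seq - last - 1


tracker = SequenceTracker()
for seq in (42, 43, 46):
    missing = tracker.observe("sensor-01", seq)
    if missing:
        print(f"sensor-01: {missing} message(s) missing before seq {seq}")
```

Keeping the counter per device matters because each device numbers its own messages independently. A real system would also need a policy for late, out-of-order messages, which this sketch simply treats as "no gap".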
Example Messages
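Here are some example messages following our schema: a normal reading (seq 42), a low-battery reading (seq 43), and a high-temperature reading (seq 44).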
{
"deviceId": "sensor-01",
"ts": "2025-12-16T10:30:45.123Z",
"metrics": {
"temperatureC": 23.5,
"humidityPct": 65.2,
"batteryPct": 87.0
},
"seq": 42
}
{
"deviceId": "sensor-01",
"ts": "2025-12-16T10:30:47.123Z",
"metrics": {
"temperatureC": 23.5,
"humidityPct": 65.2,
"batteryPct": 15.0
},
"seq": 43
}
{
"deviceId": "sensor-01",
"ts": "2025-12-16T10:30:49.123Z",
"metrics": {
"temperatureC": 35.8,
"humidityPct": 45.1,
"batteryPct": 82.0
},
"seq": 44
}
Topic Structure
We’ll use this topic pattern:
iot/devices/{deviceId}/telemetry
Examples:
iot/devices/sensor-01/telemetry
iot/devices/sensor-02/telemetry
iot/devices/room-101/telemetry
This structure:
- Groups by device type (iot/devices)
- Uses device ID in topic
- Clear purpose (telemetry)
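The topic pattern above can be sketched as a pair of small helpers: one that builds the topic for a device, and one that recovers the deviceId from a received topic. The function names are illustrative assumptions, and rejecting "/" inside a device ID is a choice made here so that the ID always occupies exactly one topic level.

```python
TOPIC_PREFIX = "iot/devices"
TOPIC_SUFFIX = "telemetry"


def telemetry_topic(device_id):
    """Build iot/devices/{deviceId}/telemetry for one device."""
    if not device_id or "/" in device_id:
        # A slash would add an extra topic level and break the pattern.
        raise ValueError(f"invalid device id: {device_id!r}")
    return f"{TOPIC_PREFIX}/{device_id}/{TOPIC_SUFFIX}"


def device_id_from_topic(topic):
    """Return the device ID from a telemetry topic, or None if it does not match."""
    parts = topic.split("/")
    if len(parts) != 4 or parts[:2] != ["iot", "devices"] or parts[3] != TOPIC_SUFFIX:
        return None
    return parts[2] or None


print(telemetry_topic("sensor-01"))
print(device_id_from_topic("iot/devices/room-101/telemetry"))
```

Parsing the ID back out of the topic gives a consumer a cheap cross-check: it can compare the topic's device ID with the deviceId field inside the payload.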
Validation Rules
When processing messages, validate:
- Required fields present
  - deviceId must exist
  - ts must be valid ISO 8601
  - metrics must be an object
  - seq must be a number
- Value ranges
  - temperatureC: -50 to 100 (reasonable range)
  - humidityPct: 0 to 100
  - batteryPct: 0 to 100
- Data types
  - All numbers are floats (except seq which is int)
  - deviceId is string
  - ts is string
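One way these rules might look in code is sketched below. validate_message is a hypothetical helper, not the tutorial's own implementation. It assumes the message has already been parsed from JSON into a dict, it skips any metric that is absent (the rules above only require the metrics object itself), and it relies on datetime.fromisoformat, which accepts the timestamp format used on this page but not every ISO 8601 variant.

```python
from datetime import datetime

# Allowed (low, high) range for each metric, taken from the rules above.
RANGES = {
    "temperatureC": (-50.0, 100.0),
    "humidityPct": (0.0, 100.0),
    "batteryPct": (0.0, 100.0),
}


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_message(msg):
    """Return a list of problems; an empty list means the message passed."""
    errors = []
    device_id = msg.get("deviceId")
    if not isinstance(device_id, str) or not device_id:
        errors.append("deviceId must be a non-empty string")
    ts = msg.get("ts")
    if not isinstance(ts, str):
        errors.append("ts must be a string")
    else:
        try:
            # Older Python versions reject a trailing "Z", so swap in an offset.
            datetime.fromisoformat(ts.replace("Z", "+00:00"))
        except ValueError:
            errors.append("ts must be valid ISO 8601")
    seq = msg.get("seq")
    if not isinstance(seq, int) or isinstance(seq, bool):
        errors.append("seq must be an integer")
    metrics = msg.get("metrics")
    if not isinstance(metrics, dict):
        errors.append("metrics must be an object")
        return errors
    for name, (low, high) in RANGES.items():
        if name not in metrics:
            continue  # assumption: individual metrics are optional
        value = metrics[name]
        if not is_number(value):
            errors.append(f"{name} must be a number")
        elif not low <= value <= high:
            errors.append(f"{name} out of range {low} to {high}")
    return errors
```

Returning a list of problems rather than raising on the first one lets the caller log every issue with a message in a single pass.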
Python Schema Helper
Here’s a Python function to create messages:
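The original listing is not reproduced here, so the following is a minimal sketch of what such a helper could look like. The name create_message, its parameters, and the optional now argument are assumptions made for illustration; only the output structure comes from the schema above.

```python
import json
from datetime import datetime, timezone


def create_message(device_id, seq, temperature_c, humidity_pct, battery_pct, now=None):
    """Build a telemetry message dict that follows the schema on this page."""
    # Use the current UTC time unless the caller supplies one (handy for tests).
    now = now or datetime.now(timezone.utc)
    # Millisecond precision with a trailing "Z", e.g. 2025-12-16T10:30:45.123Z
    ts = (
        now.astimezone(timezone.utc)
        .isoformat(timespec="milliseconds")
        .replace("+00:00", "Z")
    )
    return {
        "deviceId": device_id,
        "ts": ts,
        "metrics": {
            "temperatureC": float(temperature_c),
            "humidityPct": float(humidity_pct),
            "batteryPct": float(battery_pct),
        },
        "seq": int(seq),
    }


message = create_message("sensor-01", 42, 23.5, 65.2, 87.0)
payload = json.dumps(message)  # JSON string ready to publish
print(payload)
```

Building the dict in one place means every device in the simulator produces the same field names and types, which is the point of having a schema.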
What NOT to Do
Bad examples:
❌ Inconsistent structure:
{"temp": 23.5, "hum": 65.2} // Missing fields, different names
❌ No timestamp:
{"deviceId": "sensor-01", "temperature": 23.5} // When was this?
❌ Flat structure:
{"deviceId": "sensor-01", "temperature": 23.5, "humidity": 65.2, "battery": 87.0} // Hard to extend
✅ Good (our schema):
{
"deviceId": "sensor-01",
"ts": "2025-12-16T10:30:45.123Z",
"metrics": {"temperatureC": 23.5, "humidityPct": 65.2, "batteryPct": 87.0},
"seq": 1
}
Checkpoint ✅
You should understand:
- ✅ Why consistent schemas matter
- ✅ Our telemetry message structure
- ✅ Topic naming pattern
- ✅ Validation rules
- ✅ How to create messages in Python
What’s Next?
On the next page, you’ll build the Python device simulator that publishes messages following this schema. It will send telemetry every 2 seconds with realistic sensor values.