Designing a Telemetry Pipeline for Driverless Fleets with MongoDB
Build a scalable, low-latency telemetry pipeline with MongoDB time-series for autonomous trucking and TMS integrations like Aurora-McLeod.
Why telemetry for driverless fleets is the hard, urgent problem
Autonomous trucking teams and TMS integrators face the same grinding reality: massive, high-velocity telemetry from vehicles must be stored, queried, and joined with Transportation Management System (TMS) events with predictable, low-latency behavior. You need to support real-time dispatching from platforms like Aurora-McLeod, analytics for safety and SLAs, and long-term retention for compliance — all while keeping operational overhead low. This article walks through a battle-tested architecture and tuning patterns using MongoDB time-series and Atlas services that production teams are running in 2026.
What changed in 2025-2026 and why it matters
By late 2025 the logistics industry accelerated TMS integrations with autonomous providers. The early Aurora-McLeod integration demonstrated a new expectation: TMS workflows must treat driverless trucks as first-class capacity with near-real-time state updates and tender/dispatch events. That forces telemetry systems to be low-latency and horizontally scalable — not just bulk stores used for batch analytics.
At the same time, managed DB platforms such as MongoDB Atlas continued maturing features around time-series collections, online archiving, change streams, multi-region clusters, and observability. For fleet telemetry engineers this means you can design a pipeline that:
- Ingests millions of points per minute with bounded write latency
- Supports near-real-time joins with TMS events for dispatching and ETA updates
- Scales horizontally without manual sharding headaches
- Retains and archives historical telemetry for audits and ML
High-level architecture
The following architecture balances low-latency writes, efficient storage, and query flexibility:
- Edge/Vehicle Telemetry Agent: Runs on vehicle gateway or LTE endpoint; batches sensor data and telemetry, publishes to a message broker.
- Message Broker: MQTT or Kafka at the edge; ensures backpressure and decouples intermittent connectivity. Use local buffering for disconnected vehicles.
- Ingest Workers / Microservices: Stateless Node.js or Go workers that consume the broker, perform enrichment (map-matching, coordinate normalization), and write to MongoDB in bulk.
- Hot Storage in MongoDB Atlas: Time-series collections for per-second telemetry, sharded for scale and low-latency reads.
- Event Store for TMS Integration: A normalized collection for tenders, dispatch events, and status updates; integrated with change streams to notify TMS like McLeod.
- Cold Archive: Atlas Online Archive or S3-backed Data Lake for older telemetry accessible via data federation.
- Streaming and Webhooks: Change streams and Kafka Connectors to stream state to Aurora/McLeod and notify dispatch systems.
Schema and time-series design
Design the time-series model around read-and-write patterns. MongoDB time-series collections are optimized for append-heavy telemetry, grouping measurements into internal buckets.
Core schema
Use a single time-series collection for high-frequency position and sensor measurements. Put low-cardinality attributes in a meta field so they are indexed efficiently.
db.createCollection('telemetry', {
  timeseries: {
    timeField: 'ts',
    metaField: 'meta',
    granularity: 'seconds'
  },
  expireAfterSeconds: 2592000  // 30-day hot retention; a value of 0 would expire documents immediately
})
Example document:
{
  ts: ISODate('2026-01-18T12:00:00Z'),
  meta: { vehicleId: 'V12345', fleetId: 'F001', region: 'US-W' },
  speed: 48.1,
  heading: 270,
  lat: 36.114647,
  lon: -115.172813,
  hdop: 0.9,
  eventFlags: 0
}
Choosing granularity and schema tradeoffs
Bucket sizing controls how many measurements are grouped before compression. MongoDB exposes this through the granularity option ('seconds', 'minutes', 'hours') and, on newer versions, the explicit bucketMaxSpanSeconds and bucketRoundingSeconds parameters. For per-second position updates across thousands of vehicles, choose settings that keep buckets reasonably sized (MBs, not tens of MBs). Test with realistic traffic. If you have bursts of high-frequency sensors, consider a secondary collection per sensor class.
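As a starting point for that testing, the explicit bucket parameters can be set at creation time. A minimal mongosh sketch follows; the one-hour bounds and the collection name telemetry_imu are illustrative assumptions to validate against your own replay traffic, not recommendations:

```javascript
// mongosh: explicit bucket bounds (MongoDB 6.3+); both values must match
db.createCollection('telemetry_imu', {
  timeseries: {
    timeField: 'ts',
    metaField: 'meta',
    bucketMaxSpanSeconds: 3600,   // cap each bucket at one hour of measurements
    bucketRoundingSeconds: 3600   // align bucket boundaries to the hour
  }
})
```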
Sharding strategy for scale and low-latency
Sharding is the key to scale. Two patterns work well for driverless fleets:
Hashed shard key on vehicleId
Use a hashed shard key on meta.vehicleId to evenly distribute writes across shards. This avoids hotspots when a few vehicles send significantly more telemetry.
Example:
sh.shardCollection('db.telemetry', { 'meta.vehicleId': 'hashed' })
Zone sharding for regional locality
When low read latency matters for TMS users in a specific region, combine hashed distribution with zone sharding. Create location zones that pin vehicleId ranges to the nearest region. This keeps read latency low for dispatch dashboards while maintaining write distribution.
How to plan zones:
- Map vehicleId ranges to zones using a stable hashing mapping
- Assign zones to Atlas regions in the nearest cloud region
- Balance between even write distribution and read locality
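The planning steps above can be sketched in mongosh. This sketch assumes a compound shard key of { 'meta.region': 1, 'meta.vehicleId': 'hashed' } rather than the single hashed key shown earlier, since zone ranges need a range-based prefix to pin on; the shard, zone, and region names are illustrative:

```javascript
// mongosh sketch: pin a zone to a regional shard, then route that region's
// vehicles to it while the hashed suffix keeps writes spread within the zone
sh.addShardToZone('shard-us-west-0', 'US-W')
sh.updateZoneKeyRange(
  'db.telemetry',
  { 'meta.region': 'US-W', 'meta.vehicleId': MinKey },
  { 'meta.region': 'US-W', 'meta.vehicleId': MaxKey },
  'US-W'
)
```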
Ingestion best practices
Telemetry ingestion needs to be efficient and resilient.
Batch and bulk write
Buffer incoming messages and write with bulkWrite. Typical batch sizes are 500-2000 docs depending on document size and latency budget.
// Node.js example using the mongodb driver
const ops = batch.map(doc => ({ insertOne: { document: doc } }));
await telemetryCollection.bulkWrite(ops, { ordered: false });
Tune writeConcern and retries
For low-latency telemetry you can use writeConcern w:1 with retryableWrites enabled. For critical state transitions (tender accepted, emergency stop) use majority or transactional writes to ensure consistent TMS state.
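One way to keep this policy in one place is a small helper that maps operation criticality to driver write options. This is a sketch under assumed names (writeOptionsFor, the 'telemetry' label, and the 5-second wtimeout are all illustrative choices, not driver APIs):

```javascript
// Sketch: choose write options by operation criticality
function writeOptionsFor(criticality) {
  // routine telemetry: acknowledge from the primary only; retryable writes
  // cover transient failovers
  if (criticality === 'telemetry') {
    return { writeConcern: { w: 1 }, ordered: false };
  }
  // state transitions (tender accepted, emergency stop): wait for majority commit
  return { writeConcern: { w: 'majority', wtimeout: 5000 } };
}

// usage with the Node.js driver (collections assumed connected elsewhere):
// await col.bulkWrite(ops, writeOptionsFor('telemetry'));
// await dispatch.insertOne(event, writeOptionsFor('critical'));
```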
Connection pool and driver tuning
- Use appropriate poolSize per ingest worker according to CPU and network.
- Set socketTimeout and serverSelectionTimeout to allow quick failover.
- Reuse a single MongoClient (and its connection pool) across requests in Node.js to avoid per-request connection overhead.
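Those tuning points translate to MongoClient options like the following. The numbers are illustrative starting points to size against each worker's CPU and network capacity, not universal recommendations:

```javascript
// Sketch: driver tuning for a single ingest worker
const { MongoClient } = require('mongodb');

const client = new MongoClient(process.env.MONGO_URI, {
  maxPoolSize: 50,                 // per-worker cap; avoid oversubscribing the cluster
  minPoolSize: 5,                  // keep warm connections for steady ingest
  socketTimeoutMS: 30000,          // fail slow sockets instead of hanging writes
  serverSelectionTimeoutMS: 5000,  // fast failover when a primary steps down
  retryWrites: true
});
```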
Indexing and query patterns
Indexing for time-series workloads is different from OLTP. Focus on the metadata fields and time ranges.
Essential indexes
- meta.vehicleId + ts ascending: primary for vehicle timeline queries.
- Compound index on meta.fleetId + ts for fleet-level analytics and aggregation.
- Partial indexes for anomaly queries (e.g., eventFlags > 0) to keep index size small; note that partial indexes on time-series collections require a recent MongoDB version and have some restrictions on which fields the filter may reference.
db.telemetry.createIndex({ 'meta.vehicleId': 1, ts: 1 })
db.telemetry.createIndex({ 'meta.fleetId': 1, ts: 1 })
// partial index for events
db.telemetry.createIndex({ 'meta.vehicleId': 1, ts: 1 }, { partialFilterExpression: { eventFlags: { $gt: 0 } } })
Query shapes to optimize
Most queries are time-bounded and target a vehicle or a small set of vehicles. Ensure filters include meta.vehicleId or meta.fleetId and a time range. Use projections to limit fields returned (exclude bulky sensor payloads when not needed).
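A typical vehicle-timeline query then looks like the mongosh sketch below; the one-hour window and projected fields are illustrative, and the filter is shaped to use the { 'meta.vehicleId': 1, ts: 1 } index created above:

```javascript
// mongosh sketch: time-bounded vehicle timeline with a narrow projection
db.telemetry.find(
  {
    'meta.vehicleId': 'V12345',
    ts: {
      $gte: ISODate('2026-01-18T11:00:00Z'),
      $lt: ISODate('2026-01-18T12:00:00Z')
    }
  },
  { _id: 0, ts: 1, lat: 1, lon: 1, speed: 1 }  // exclude bulky sensor payloads
).sort({ ts: 1 })
```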
Joining telemetry with TMS events (Aurora-McLeod style)
Integrations between TMS and driverless stacks need consistent event propagation and low latency:
- Maintain a normalized collection called dispatch_events with tender, dispatch, and status fields.
- Use change streams on dispatch_events and telemetry to push relevant updates to the TMS via webhooks or to a Kafka topic consumed by the TMS adapter.
- Enrich telemetry writes with the current dispatch assignment to enable fast joins at query time.
// change stream to notify the TMS of dispatch changes
const stream = db.collection('dispatch_events').watch([
  { $match: { operationType: { $in: ['insert', 'update'] } } }
]);
stream.on('change', change => {
  // push to Kafka or call the McLeod webhook
});
For bidirectional integrations, TMS tenders should be written as events. The fleet controller consumes those events to accept/reject assignments and updates vehicle meta state.
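A minimal sketch of shaping an inbound tender as an event document follows. The field names (tenderId, status, source) are illustrative assumptions to adapt to your TMS adapter's contract, not a defined Aurora or McLeod schema:

```javascript
// Sketch: shape a TMS tender as a dispatch_events document
function buildTenderEvent(tender) {
  return {
    type: 'tender',
    tenderId: tender.tenderId,
    vehicleId: tender.vehicleId,
    status: 'offered',   // fleet controller later moves this to accepted/rejected
    ts: new Date(),
    source: 'tms'
  };
}

// on acceptance, the fleet controller updates both the event and vehicle state:
// await db.collection('dispatch_events').updateOne(
//   { tenderId: t.tenderId },
//   { $set: { status: 'accepted', acceptedAt: new Date() } });
```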
Real-time analytics and aggregations
Use aggregation pipelines for windowed metrics and SLA calculations. Since MongoDB 5.0 introduced $setWindowFields and related pipeline optimizations, you can run efficient rolling-average and ETA pipelines.
// sample: compute last 5-min avg speed for a vehicle
db.telemetry.aggregate([
  { $match: { 'meta.vehicleId': 'V12345', ts: { $gte: new Date(Date.now() - 5 * 60 * 1000) } } },
  { $group: { _id: null, avgSpeed: { $avg: '$speed' } } }
])
For dashboards, precompute common aggregates via materialized views or scheduled aggregation jobs to keep UI latency low.
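A scheduled rollup can be sketched with $merge as below; run it from Atlas Triggers or cron. The 5-minute window and the telemetry_5min collection name are illustrative assumptions:

```javascript
// mongosh sketch: materialize 5-minute per-vehicle aggregates for dashboards
db.telemetry.aggregate([
  { $match: { ts: { $gte: new Date(Date.now() - 5 * 60 * 1000) } } },
  { $group: {
      _id: {
        vehicleId: '$meta.vehicleId',
        window: { $dateTrunc: { date: '$ts', unit: 'minute', binSize: 5 } }
      },
      avgSpeed: { $avg: '$speed' },
      points: { $count: {} }
  } },
  { $merge: { into: 'telemetry_5min', whenMatched: 'replace' } }
])
```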
Hot/Warm/Cold data lifecycle
Telemetry storage costs and query patterns vary by age. Implement a tiered lifecycle:
- Hot: Last 7-30 days of per-second telemetry in Atlas cluster for low-latency queries and dispatch needs.
- Warm: Aggregated traces and event-level data for 6-12 months, either in Atlas with lower tier nodes or aggregated collections.
- Cold: Raw historical telemetry archived to S3 via Atlas Online Archive or Data Lake for long-term retention and ML training.
Use TTL and Online Archive policies. Keep critical audit trails and dispatch events replicated longer with stricter retention.
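Hot-tier retention on a time-series collection can be adjusted after creation with collMod; the 30-day value below is an illustrative assumption to pair with an Atlas Online Archive rule for the cold tier:

```javascript
// mongosh sketch: set or adjust hot-tier retention on the time-series collection
db.runCommand({ collMod: 'telemetry', expireAfterSeconds: 2592000 })  // 30 days
```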
Observability and performance tuning
Operational visibility is non-negotiable. Monitor:
- Write and read latencies per shard
- Bulk write sizes and retry rates
- Cache miss rates and page faults
- Slow query logs and index usage
Use Atlas Performance Advisor, slow query profiler, serverStatus metrics, and OpenTelemetry traces from your ingest workers. Periodically run resilience drills: scale a shard down, simulate network partitions, and validate failover and change-stream continuity.
Security, compliance, and backups
Autonomous fleet telemetry often includes PII and operationally sensitive data. Implement:
- Encryption at rest and in transit
- Client-side field-level encryption for PII, with encryption keys managed in a KMS
- Fine-grained RBAC for services and human access
- Audit logging for regulatory compliance
- Continuous backups with point-in-time recovery and tested restores
Testing for scale: how to validate your pipeline
Load-testing is essential. Create a realistic replay of vehicle streams including spikes, offline buffering, and out-of-order messages. Key checks:
- Sustained throughput at peak: writes/sec and average QPS
- End-to-end latency: message arrival to DB commit to TMS notification
- Shard balance and memory headroom (no out-of-memory kills) under load
- Recovery time for failover and resharding operations
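A replay generator for these checks can be sketched as below. The 1 Hz base rate, 10x spike factor, and ±1 s jitter are illustrative assumptions; the jitter deliberately produces out-of-order timestamps to exercise the ingest path:

```javascript
// Sketch: synthetic telemetry batches with spikes and out-of-order arrivals
function makeBatch(vehicleCount, baseTime, spike = false) {
  const perVehicle = spike ? 10 : 1;  // spike: 10 Hz bursts instead of 1 Hz
  const docs = [];
  for (let v = 0; v < vehicleCount; v++) {
    for (let i = 0; i < perVehicle; i++) {
      const jitterMs = Math.floor(Math.random() * 2000) - 1000;  // +/- 1 s
      docs.push({
        ts: new Date(baseTime.getTime() + jitterMs),
        meta: { vehicleId: `V${v}`, fleetId: 'F001', region: 'US-W' },
        speed: 40 + Math.random() * 20
      });
    }
  }
  return docs;
}

// feed batches through the same ingestBatch path used in production
```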
Advanced strategies and 2026 predictions
Looking ahead in 2026, fleet telemetry systems will increasingly:
- Shift more pre-processing to the edge to reduce cloud bandwidth and privacy surface.
- Adopt hybrid real-time/ML inference pipelines where aggregated telemetry triggers ML models for route optimization and anomaly detection.
- Use multi-cloud and multi-region Atlas clusters to meet TMS latency SLAs as autonomous trucking expands globally.
Architecturally, expect more TMS platforms to rely on streaming change events rather than polling. The Aurora-McLeod early integration is a proof point: immediate, in-dashboard tendering is compelling for carriers and shippers. Your DB pipeline should be built to emit consistent events and to let TMS systems consume canonical state quickly.
Practical checklist: build and tune your telemetry pipeline
- Model telemetry as a MongoDB time-series collection with meta fields for vehicle and fleet.
- Shard by meta.vehicleId (hashed) and add zone sharding if regionally sensitive reads are required.
- Ingest with batched bulkWrite and retryable writes; tune writeConcern per operation criticality.
- Create compound indexes for vehicleId + ts and partial indexes for event queries.
- Use change streams to stream dispatch_events to the TMS and to notify downstream consumers.
- Implement hot/warm/cold lifecycle with Online Archive and aggregation rollups.
- Instrument with Atlas monitoring, OpenTelemetry, and run periodic resilience tests.
- Enforce encryption, RBAC, and continuous backups with tested restores.
Example end-to-end Node.js snippets
Minimal ingestion loop with bulk writes and a change stream consumer for TMS notifications.
// ingestion worker (simplified)
const { MongoClient } = require('mongodb');

async function main() {
  const client = new MongoClient(process.env.MONGO_URI, { retryWrites: true });
  await client.connect();
  const db = client.db('fleet');
  const col = db.collection('telemetry');

  async function ingestBatch(batch) {
    if (!batch.length) return;
    const ops = batch.map(doc => ({ insertOne: { document: doc } }));
    await col.bulkWrite(ops, { ordered: false });
  }

  // change stream to forward dispatch events to the TMS
  const dispatch = db.collection('dispatch_events');
  const cs = dispatch.watch();
  cs.on('change', async change => {
    // push change to Kafka or call the TMS webhook
  });
}

main().catch(console.error);
Actionable takeaways
- Prioritize time-series collections for per-second telemetry to get storage and query efficiency.
- Shard smartly — hashed vehicleId for write scale, zones for read locality.
- Use change streams to integrate with TMS platforms like Aurora-McLeod in near real time.
- Automate lifecycle with hot/warm/cold tiers — it controls cost and keeps the hot working set fast.
- Test under realistic conditions including disconnects and spikes — production surprises are expensive.
Closing thoughts
Autonomous trucking is pushing telemetry systems into new territory. TMS integrations like Aurora-McLeod show the business value of getting telemetry and dispatch state into the hands of carriers immediately. By combining MongoDB time-series collections, sharding strategies, change streams, and a clear lifecycle strategy, you can build a pipeline that meets both the low-latency requirements of dispatching and the scale needed for fleet-wide analytics.
Call to action
Ready to prototype a telemetry pipeline? Start by modeling a small fleet in a MongoDB Atlas cluster, implement time-series collections and a change stream to a mock TMS, and run a realistic replay. If you want a hands-on walkthrough tailored to your fleet size and SLA targets, contact our engineering team for a workshop and performance assessment.
