Designing a Telemetry Pipeline for Driverless Fleets with MongoDB
Build a scalable, low-latency telemetry pipeline with MongoDB time-series for autonomous trucking and TMS integrations like Aurora-McLeod.
Why telemetry for driverless fleets is the hard, urgent problem
Autonomous trucking teams and TMS integrators face the same grinding reality: massive, high-velocity telemetry from vehicles must be stored, queried, and joined with Transportation Management System (TMS) events with predictable, low-latency behavior. You need to support real-time dispatching from platforms like Aurora-McLeod, analytics for safety and SLAs, and long-term retention for compliance — all while keeping operational overhead low. This article walks through a battle-tested architecture and tuning patterns using MongoDB time-series and Atlas services that production teams are running in 2026.
What changed in 2025-2026 and why it matters
By late 2025 the logistics industry accelerated TMS integrations with autonomous providers. The early Aurora-McLeod integration demonstrated a new expectation: TMS workflows must treat driverless trucks as first-class capacity with near-real-time state updates and tender/dispatch events. That forces telemetry systems to be low-latency and horizontally scalable — not just bulk stores used for batch analytics.
At the same time, managed DB platforms such as MongoDB Atlas continued maturing features around time-series collections, online archiving, change streams, multi-region clusters, and observability. For fleet telemetry engineers this means you can design a pipeline that:
- Ingests millions of points per minute with bounded write latency
- Supports near-real-time joins with TMS events for dispatching and ETA updates
- Scales horizontally without manual sharding headaches
- Retains and archives historical telemetry for audits and ML
High-level architecture
The following architecture balances low-latency writes, efficient storage, and query flexibility:
- Edge/Vehicle Telemetry Agent: Runs on vehicle gateway or LTE endpoint; batches sensor data and telemetry, publishes to a message broker.
- Message Broker: MQTT or Kafka at the edge; ensures backpressure and decouples intermittent connectivity. Use local buffering for disconnected vehicles.
- Ingest Workers / Microservices: Stateless Node.js or Go workers that consume the broker, perform enrichment (map-matching, coordinate normalization), and write to MongoDB in bulk.
- Hot Storage in MongoDB Atlas: Time-series collections for per-second telemetry, sharded for scale and low-latency reads.
- Event Store for TMS Integration: A normalized collection for tenders, dispatch events, and status updates; integrated with change streams to notify TMS like McLeod.
- Cold Archive: Atlas Online Archive or S3-backed Data Lake for older telemetry accessible via data federation.
- Streaming and Webhooks: Change streams and Kafka Connectors to stream state to Aurora/McLeod and notify dispatch systems.
Schema and time-series design
Design the time-series model around read-and-write patterns. MongoDB time-series collections are optimized for append-heavy telemetry, grouping measurements into internal buckets.
Core schema
Use a single time-series collection for high-frequency position and sensor measurements. Put low-cardinality attributes in a meta field so they are indexed efficiently.
db.createCollection('telemetry', {
  timeseries: {
    timeField: 'ts',
    metaField: 'meta',
    granularity: 'seconds'
  },
  expireAfterSeconds: 2592000  // 30-day hot retention; a value of 0 would expire documents immediately
})
Example document:
{
  ts: ISODate('2026-01-18T12:00:00Z'),
  meta: { vehicleId: 'V12345', fleetId: 'F001', region: 'US-W' },
  speed: 48.1,
  heading: 270,
  lat: 36.114647,
  lon: -115.172813,
  hdop: 0.9,
  eventFlags: 0
}
Choosing granularity and schema tradeoffs
Bucket sizing controls how many measurements are grouped before compression. MongoDB exposes this through the granularity option ('seconds', 'minutes', 'hours') and, on newer versions, the explicit bucketMaxSpanSeconds and bucketRoundingSeconds parameters. For per-second position updates across thousands of vehicles, choose settings that keep buckets reasonably sized (MBs, not tens of MBs). Test with realistic traffic. If you have bursts of high-frequency sensors, consider a secondary collection per sensor class.
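As a starting point for that testing, the explicit bucket parameters can be set at creation time. A minimal mongosh sketch follows; the one-hour bounds and the collection name telemetry_imu are illustrative assumptions to validate against your own replay traffic, not recommendations:

```javascript
// mongosh: explicit bucket bounds (MongoDB 6.3+); both values must match
db.createCollection('telemetry_imu', {
  timeseries: {
    timeField: 'ts',
    metaField: 'meta',
    bucketMaxSpanSeconds: 3600,   // cap each bucket at one hour of measurements
    bucketRoundingSeconds: 3600   // align bucket boundaries to the hour
  }
})
```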
Sharding strategy for scale and low-latency
Sharding is the key to scale. Two patterns work well for driverless fleets:
Hashed shard key on vehicleId
Use a hashed shard key on meta.vehicleId to evenly distribute writes across shards. This avoids hotspots when a few vehicles send significantly more telemetry.
Example:
sh.shardCollection('db.telemetry', { 'meta.vehicleId': 'hashed' })
Zone sharding for regional locality
When low read latency matters for TMS users in a specific region, combine hashed distribution with zone sharding. Create location zones that pin vehicleId ranges to the nearest region. This keeps read latency low for dispatch dashboards while maintaining write distribution.
How to plan zones:
- Map vehicleId ranges to zones using a stable hashing mapping
- Assign zones to Atlas regions in the nearest cloud region
- Balance between even write distribution and read locality
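The planning steps above can be sketched in mongosh. This sketch assumes a compound shard key of { 'meta.region': 1, 'meta.vehicleId': 'hashed' } rather than the single hashed key shown earlier, since zone ranges need a range-based prefix to pin on; the shard, zone, and region names are illustrative:

```javascript
// mongosh sketch: pin a zone to a regional shard, then route that region's
// vehicles to it while the hashed suffix keeps writes spread within the zone
sh.addShardToZone('shard-us-west-0', 'US-W')
sh.updateZoneKeyRange(
  'db.telemetry',
  { 'meta.region': 'US-W', 'meta.vehicleId': MinKey },
  { 'meta.region': 'US-W', 'meta.vehicleId': MaxKey },
  'US-W'
)
```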
Ingestion best practices
Telemetry ingestion needs to be efficient and resilient.
Batch and bulk write
Buffer incoming messages and write with bulkWrite. Typical batch sizes are 500-2000 docs depending on document size and latency budget.
// Node.js example using the mongodb driver
const ops = batch.map(doc => ({ insertOne: { document: doc } }));
await telemetryCollection.bulkWrite(ops, { ordered: false });
Tune writeConcern and retries
For low-latency telemetry you can use writeConcern w:1 with retryableWrites enabled. For critical state transitions (tender accepted, emergency stop) use majority or transactional writes to ensure consistent TMS state.
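One way to keep this policy in one place is a small helper that maps operation criticality to driver write options. This is a sketch under assumed names (writeOptionsFor, the 'telemetry' label, and the 5-second wtimeout are all illustrative choices, not driver APIs):

```javascript
// Sketch: choose write options by operation criticality
function writeOptionsFor(criticality) {
  // routine telemetry: acknowledge from the primary only; retryable writes
  // cover transient failovers
  if (criticality === 'telemetry') {
    return { writeConcern: { w: 1 }, ordered: false };
  }
  // state transitions (tender accepted, emergency stop): wait for majority commit
  return { writeConcern: { w: 'majority', wtimeout: 5000 } };
}

// usage with the Node.js driver (collections assumed connected elsewhere):
// await col.bulkWrite(ops, writeOptionsFor('telemetry'));
// await dispatch.insertOne(event, writeOptionsFor('critical'));
```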
Connection pool and driver tuning
- Use appropriate poolSize per ingest worker according to CPU and network.
- Set socketTimeout and serverSelectionTimeout to allow quick failover.
- Reuse a single MongoClient (and its connection pool) across requests in Node.js to avoid per-request connection overhead.
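Those tuning points translate to MongoClient options like the following. The numbers are illustrative starting points to size against each worker's CPU and network capacity, not universal recommendations:

```javascript
// Sketch: driver tuning for a single ingest worker
const { MongoClient } = require('mongodb');

const client = new MongoClient(process.env.MONGO_URI, {
  maxPoolSize: 50,                 // per-worker cap; avoid oversubscribing the cluster
  minPoolSize: 5,                  // keep warm connections for steady ingest
  socketTimeoutMS: 30000,          // fail slow sockets instead of hanging writes
  serverSelectionTimeoutMS: 5000,  // fast failover when a primary steps down
  retryWrites: true
});
```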
Indexing and query patterns
Indexing for time-series workloads is different from OLTP. Focus on the metadata fields and time ranges.
Essential indexes
- meta.vehicleId + ts ascending: primary for vehicle timeline queries.
- Compound index on meta.fleetId + ts for fleet-level analytics and aggregation.
- Partial indexes for anomaly queries (e.g., eventFlags > 0) to keep index size small; note that partial indexes on time-series collections require a recent MongoDB version and have some restrictions on which fields the filter may reference.
db.telemetry.createIndex({ 'meta.vehicleId': 1, ts: 1 })
db.telemetry.createIndex({ 'meta.fleetId': 1, ts: 1 })
// partial index for events
db.telemetry.createIndex({ 'meta.vehicleId': 1, ts: 1 }, { partialFilterExpression: { eventFlags: { $gt: 0 } } })
Query shapes to optimize
Most queries are time-bounded and target a vehicle or a small set of vehicles. Ensure filters include meta.vehicleId or meta.fleetId and a time range. Use projections to limit fields returned (exclude bulky sensor payloads when not needed).
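A typical vehicle-timeline query then looks like the mongosh sketch below; the one-hour window and projected fields are illustrative, and the filter is shaped to use the { 'meta.vehicleId': 1, ts: 1 } index created above:

```javascript
// mongosh sketch: time-bounded vehicle timeline with a narrow projection
db.telemetry.find(
  {
    'meta.vehicleId': 'V12345',
    ts: {
      $gte: ISODate('2026-01-18T11:00:00Z'),
      $lt: ISODate('2026-01-18T12:00:00Z')
    }
  },
  { _id: 0, ts: 1, lat: 1, lon: 1, speed: 1 }  // exclude bulky sensor payloads
).sort({ ts: 1 })
```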
Joining telemetry with TMS events (Aurora-McLeod style)
Integrations between TMS and driverless stacks need consistent event propagation and low latency:
- Maintain a normalized collection called dispatch_events with tender, dispatch, and status fields.
- Use change streams on dispatch_events and telemetry to push relevant updates to the TMS via webhooks or to a Kafka topic consumed by the TMS adapter.
- Enrich telemetry writes with the current dispatch assignment to enable fast joins at query time.
// change stream to notify the TMS of dispatch changes
const stream = db.collection('dispatch_events').watch([
  { $match: { operationType: { $in: ['insert', 'update'] } } }
]);
stream.on('change', change => {
  // push to Kafka or call the McLeod webhook
});
For bidirectional integrations, TMS tenders should be written as events. The fleet controller consumes those events to accept/reject assignments and updates vehicle meta state.
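A minimal sketch of shaping an inbound tender as an event document follows. The field names (tenderId, status, source) are illustrative assumptions to adapt to your TMS adapter's contract, not a defined Aurora or McLeod schema:

```javascript
// Sketch: shape a TMS tender as a dispatch_events document
function buildTenderEvent(tender) {
  return {
    type: 'tender',
    tenderId: tender.tenderId,
    vehicleId: tender.vehicleId,
    status: 'offered',   // fleet controller later moves this to accepted/rejected
    ts: new Date(),
    source: 'tms'
  };
}

// on acceptance, the fleet controller updates both the event and vehicle state:
// await db.collection('dispatch_events').updateOne(
//   { tenderId: t.tenderId },
//   { $set: { status: 'accepted', acceptedAt: new Date() } });
```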
Real-time analytics and aggregations
Use aggregation pipelines for windowed metrics and SLA calculations. Since MongoDB 5.0 introduced $setWindowFields and related pipeline optimizations, you can run efficient rolling-average and ETA pipelines.
// sample: compute last 5-min avg speed for a vehicle
db.telemetry.aggregate([
  { $match: { 'meta.vehicleId': 'V12345', ts: { $gte: new Date(Date.now() - 5 * 60 * 1000) } } },
  { $group: { _id: null, avgSpeed: { $avg: '$speed' } } }
])
For dashboards, precompute common aggregates via materialized views or scheduled aggregation jobs to keep UI latency low.
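A scheduled rollup can be sketched with $merge as below; run it from Atlas Triggers or cron. The 5-minute window and the telemetry_5min collection name are illustrative assumptions:

```javascript
// mongosh sketch: materialize 5-minute per-vehicle aggregates for dashboards
db.telemetry.aggregate([
  { $match: { ts: { $gte: new Date(Date.now() - 5 * 60 * 1000) } } },
  { $group: {
      _id: {
        vehicleId: '$meta.vehicleId',
        window: { $dateTrunc: { date: '$ts', unit: 'minute', binSize: 5 } }
      },
      avgSpeed: { $avg: '$speed' },
      points: { $count: {} }
  } },
  { $merge: { into: 'telemetry_5min', whenMatched: 'replace' } }
])
```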
Hot/Warm/Cold data lifecycle
Telemetry storage costs and query patterns vary by age. Implement a tiered lifecycle:
- Hot: Last 7-30 days of per-second telemetry in Atlas cluster for low-latency queries and dispatch needs.
- Warm: Aggregated traces and event-level data for 6-12 months, either in Atlas with lower tier nodes or aggregated collections.
- Cold: Raw historical telemetry archived to S3 via Atlas Online Archive or Data Lake for long-term retention and ML training.
Use TTL and Online Archive policies. Keep critical audit trails and dispatch events replicated longer with stricter retention.
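Hot-tier retention on a time-series collection can be adjusted after creation with collMod; the 30-day value below is an illustrative assumption to pair with an Atlas Online Archive rule for the cold tier:

```javascript
// mongosh sketch: set or adjust hot-tier retention on the time-series collection
db.runCommand({ collMod: 'telemetry', expireAfterSeconds: 2592000 })  // 30 days
```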
Observability and performance tuning
Operational visibility is non-negotiable. Monitor:
- Write and read latencies per shard
- Bulk write sizes and retry rates
- Cache miss rates and page faults
- Slow query logs and index usage
Use Atlas Performance Advisor, slow query profiler, serverStatus metrics, and OpenTelemetry traces from your ingest workers. Periodically run resilience drills: scale a shard down, simulate network partitions, and validate failover and change-stream continuity.
Security, compliance, and backups
Autonomous fleet telemetry often includes PII and operationally sensitive data. Implement:
- Encryption at rest and in transit
- Client-side field-level encryption for PII, with encryption keys managed in a KMS
- Fine-grained RBAC for services and human access
- Audit logging for regulatory compliance
- Continuous backups with point-in-time recovery and tested restores
Testing for scale: how to validate your pipeline
Load-testing is essential. Create a realistic replay of vehicle streams including spikes, offline buffering, and out-of-order messages. Key checks:
- Sustained throughput at peak: writes/sec and average QPS
- End-to-end latency: message arrival to DB commit to TMS notification
- Shard balance and memory headroom (no out-of-memory kills) under load
- Recovery time for failover and resharding operations
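A replay generator for these checks can be sketched as below. The 1 Hz base rate, 10x spike factor, and ±1 s jitter are illustrative assumptions; the jitter deliberately produces out-of-order timestamps to exercise the ingest path:

```javascript
// Sketch: synthetic telemetry batches with spikes and out-of-order arrivals
function makeBatch(vehicleCount, baseTime, spike = false) {
  const perVehicle = spike ? 10 : 1;  // spike: 10 Hz bursts instead of 1 Hz
  const docs = [];
  for (let v = 0; v < vehicleCount; v++) {
    for (let i = 0; i < perVehicle; i++) {
      const jitterMs = Math.floor(Math.random() * 2000) - 1000;  // +/- 1 s
      docs.push({
        ts: new Date(baseTime.getTime() + jitterMs),
        meta: { vehicleId: `V${v}`, fleetId: 'F001', region: 'US-W' },
        speed: 40 + Math.random() * 20
      });
    }
  }
  return docs;
}

// feed batches through the same ingestBatch path used in production
```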
Advanced strategies and 2026 predictions
Looking ahead in 2026, fleet telemetry systems will increasingly:
- Shift more pre-processing to the edge to reduce cloud bandwidth and privacy surface.
- Adopt hybrid real-time/ML inference pipelines where aggregated telemetry triggers ML models for route optimization and anomaly detection.
- Use multi-cloud and multi-region Atlas clusters to meet TMS latency SLAs as autonomous trucking expands globally.
Architecturally, expect more TMS platforms to rely on streaming change events rather than polling. The Aurora-McLeod early integration is a proof point: immediate, in-dashboard tendering is compelling for carriers and shippers. Your DB pipeline should be built to emit consistent events and to let TMS systems consume canonical state quickly.
Practical checklist: build and tune your telemetry pipeline
- Model telemetry as a MongoDB time-series collection with meta fields for vehicle and fleet.
- Shard by meta.vehicleId (hashed) and add zone sharding if regionally sensitive reads are required.
- Ingest with batched bulkWrite and retryable writes; tune writeConcern per operation criticality.
- Create compound indexes for vehicleId + ts and partial indexes for event queries.
- Use change streams to stream dispatch_events to the TMS and to notify downstream consumers.
- Implement hot/warm/cold lifecycle with Online Archive and aggregation rollups.
- Instrument with Atlas monitoring, OpenTelemetry, and run periodic resilience tests.
- Enforce encryption, RBAC, and continuous backups with tested restores.
Example end-to-end Node.js snippets
Minimal ingestion loop with bulk writes and a change stream consumer for TMS notifications.
// ingestion worker (simplified)
const { MongoClient } = require('mongodb');

async function main() {
  const client = new MongoClient(process.env.MONGO_URI, { retryWrites: true });
  await client.connect();
  const db = client.db('fleet');
  const col = db.collection('telemetry');

  async function ingestBatch(batch) {
    if (!batch.length) return;
    const ops = batch.map(doc => ({ insertOne: { document: doc } }));
    await col.bulkWrite(ops, { ordered: false });
  }

  // change stream to forward dispatch events to the TMS
  const dispatch = db.collection('dispatch_events');
  const cs = dispatch.watch();
  cs.on('change', async change => {
    // push change to Kafka or call the TMS webhook
  });
}

main().catch(console.error);
Actionable takeaways
- Prioritize time-series collections for per-second telemetry to get storage and query efficiency.
- Shard smartly — hashed vehicleId for write scale, zones for read locality.
- Use change streams to integrate with TMS platforms like Aurora-McLeod in near real time.
- Automate lifecycle with hot/warm/cold tiers — it controls cost and keeps the hot working set fast.
- Test under realistic conditions including disconnects and spikes — production surprises are expensive.
Closing thoughts
Autonomous trucking is pushing telemetry systems into new territory. TMS integrations like Aurora-McLeod show the business value of getting telemetry and dispatch state into the hands of carriers immediately. By combining MongoDB time-series collections, sharding strategies, change streams, and a clear lifecycle strategy, you can build a pipeline that meets both the low-latency requirements of dispatching and the scale needed for fleet-wide analytics.
Call to action
Ready to prototype a telemetry pipeline? Start by modeling a small fleet in a MongoDB Atlas cluster, implement time-series collections and a change stream to a mock TMS, and run a realistic replay. If you want a hands-on walkthrough tailored to your fleet size and SLA targets, contact our engineering team for a workshop and performance assessment.
