Messaging infra: uses and pitfalls

“Now the release will swap the old deployment with the new one in just 2 to 3 minutes!” exclaimed our DevOps engineer, pleased with his achievement. “That is good progress on the deployment side, but it doesn’t solve my issue,” replied the technical manager, who had further concerns. “Any data loss is unacceptable. Devices send 6 messages a minute, one every 10 seconds! The daily deployment will cause a loss of 12 to 18 messages per device…”

Let’s leave them to their discussion and turn our attention to the missing piece in their IoT infrastructure: a messaging service! For 24×7 data continuity and zero tolerance for data loss, a messaging service is essential. It decouples the producer of the messages (the devices) from the consumer (the web service), providing a layer of indirection that insulates each from the other’s unavailability.

Typically, a messaging service offers queues, topics, event hubs, etc. It comes with its own storage, which persists messages for up to a few days if the web service has not yet consumed them.
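To make the decoupling concrete, here is a minimal sketch of how a queue absorbs the deployment window from the opening conversation. It is a toy in-memory stand-in, not any real cloud service: the names `BufferingQueue`, `Message` and `simulate_deployment` are made up for illustration, and the 3-day retention period is an assumption picked to match “up to a few days”.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Message:
    device_id: str
    sent_at: float  # seconds since an arbitrary start time
    payload: dict


class BufferingQueue:
    """Toy in-memory stand-in for a managed queue with a retention period."""

    def __init__(self, retention_seconds: float):
        self.retention_seconds = retention_seconds
        self._messages = deque()

    def publish(self, message: Message) -> None:
        # The producer only needs the queue to be reachable, not the consumer.
        self._messages.append(message)

    def drain(self, now: float) -> list:
        # Hand over every message still inside the retention window, oldest first.
        # Messages older than the retention period are silently dropped.
        delivered = []
        while self._messages:
            message = self._messages.popleft()
            if now - message.sent_at <= self.retention_seconds:
                delivered.append(message)
        return delivered


def simulate_deployment(outage_seconds: int, interval_seconds: int = 10) -> int:
    """Return how many messages from one device survive a consumer outage."""
    queue = BufferingQueue(retention_seconds=3 * 24 * 3600)  # assumed: 3 days
    # The device keeps sending one message every 10 seconds during the outage.
    for sent_at in range(0, outage_seconds, interval_seconds):
        queue.publish(Message("device-1", float(sent_at), {"reading": sent_at}))
    # The web service comes back online and catches up on the backlog.
    return len(queue.drain(now=float(outage_seconds)))
```

Calling `simulate_deployment(120)` and `simulate_deployment(180)` returns 12 and 18: the very messages the technical manager was worried about are held in the queue and delivered once the web service returns, instead of being lost.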

While this gives breathing room on the availability and data loss issues, it’s not a silver bullet! You may still experience problems such as:

  • If the producer or the consumer is unavailable for a long time, the messaging infra cannot keep real-time information flowing to the stranded users of the system. The system may not show the latest measurements, or may fail to alert in time! Only when the service or the device is restored, possibly through manual intervention, does the back-processing take place, so notifications may be shown after the fact!
  • The messaging infra also has its own limits: a long absence of the consumer (the web service) puts a load on it. The storage space may run out, or the messages may expire.
  • Since messaging services are typically PaaS services on the cloud, application designers tend to take them for granted, trusting the cloud provider’s SLA. As a result, error handling for the messaging service gets ignored. But each messaging service has a way of separating messages with errors from the healthy, live messages. An application needs an error-handling workflow for those separated messages to ensure no data loss!
  • A structure like a queue is FIFO. If the service has been down for a long time, the need of the hour once it’s back online is to process the latest messages, but it has to work through the older messages before it can retrieve the current ones!
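Two of the pitfalls above, error handling and the FIFO backlog, can be sketched in a few lines. This is only an illustration under stated assumptions: `consume` and `catch_up_seconds` are hypothetical helpers rather than any provider’s API, the limit of 3 attempts is arbitrary, and the catch-up estimate assumes steady produce and consume rates.

```python
from collections import deque


def consume(backlog, handler, max_attempts=3):
    """Process messages oldest-first; park repeated failures in a dead-letter list."""
    processed = []
    dead_letters = []
    pending = deque((message, 0) for message in backlog)
    while pending:
        message, attempts = pending.popleft()
        try:
            handler(message)
            processed.append(message)
        except Exception as error:
            attempts += 1
            if attempts >= max_attempts:
                # Set the message aside with its error so it can be inspected
                # and replayed later, rather than dropped or retried forever.
                dead_letters.append((message, repr(error)))
            else:
                # Retry at the head of the queue so FIFO order is preserved.
                pending.appendleft((message, attempts))
    return processed, dead_letters


def catch_up_seconds(backlog_size, consume_rate, produce_rate):
    """Estimate the seconds until a FIFO consumer reaches the live messages.

    Rates are in messages per second. New messages keep arriving while the
    backlog drains, so only the surplus consume rate reduces the backlog.
    """
    if consume_rate <= produce_rate:
        raise ValueError("the consumer never catches up at these rates")
    return backlog_size / (consume_rate - produce_rate)
```

For example, a device sending one message every 10 seconds (0.1 per second) builds a backlog of 360 messages during a one-hour outage. A consumer handling 1 message per second for that device needs roughly 400 seconds, close to 7 minutes, before it sees current data. Whatever ends up in `dead_letters` is exactly what the error-handling workflow has to pick up.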

All in all, messaging infrastructure is a must in IoT systems. If we, as application designers, ‘read the terms and conditions carefully’ before designing a system, we can reap rich benefits in terms of availability and zero data loss!

Do you have any experiences you would like to share here?