The core platform layer provides a set of capabilities to connect, collect, monitor and control millions of devices. Let’s look at each of the components of the core platform layer.
An Enterprise IoT platform typically supports one protocol end to end, such as AMQP or MQTT, as part of the overall stack. However, in an IoT landscape where there is no standardized protocol on which all vendors can converge, an Enterprise IoT stack needs to support commonly used protocols, industry protocols, and evolving standards in the future.
One option is to use the device gateway discussed earlier to convert proprietary device protocols to the protocol supported by the platform, but that may not always be possible: devices may connect directly, or the device gateway may not support the protocol used by the IoT stack. To handle multiple protocols, a protocol gateway is used, which converts incoming protocols to the one supported by your IoT stack. Building an abstraction like a protocol gateway makes it easier to support additional protocols in the future.
The protocol gateway layer provides connectivity to devices over the protocols supported by the IoT stack. Typically the communication is channeled to the messaging platform or middleware over a protocol like MQTT or AMQP. The protocol gateway layer can also act as a facade for supporting different protocols, performing conversions across protocols and handing off the result to the corresponding IoT messaging platform.
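The facade idea above can be sketched in a few lines. This is a minimal illustration, not a real gateway: the adapter classes, the canonical message shape, and the `publish` hand-off are all hypothetical names invented for the example.

```python
import json
from typing import Callable, Dict

class ProtocolAdapter:
    """Converts a vendor-specific payload into a canonical message dict."""
    def to_canonical(self, raw: bytes) -> dict:
        raise NotImplementedError

class CsvAdapter(ProtocolAdapter):
    """Hypothetical vendor format: b'deviceId,parameter,value'."""
    def to_canonical(self, raw: bytes) -> dict:
        device_id, name, value = raw.decode().split(",")
        return {"device": device_id, "param": name, "value": float(value)}

class JsonAdapter(ProtocolAdapter):
    def to_canonical(self, raw: bytes) -> dict:
        return json.loads(raw)

class ProtocolGateway:
    """Facade: picks the right adapter, then hands off to the middleware."""
    def __init__(self, publish: Callable[[dict], None]):
        self.publish = publish        # e.g. an MQTT/AMQP publish function
        self.adapters: Dict[str, ProtocolAdapter] = {}

    def register(self, protocol: str, adapter: ProtocolAdapter) -> None:
        self.adapters[protocol] = adapter

    def on_message(self, protocol: str, raw: bytes) -> None:
        self.publish(self.adapters[protocol].to_canonical(raw))

# Usage: collect published messages in a list instead of a real broker.
received = []
gw = ProtocolGateway(publish=received.append)
gw.register("csv", CsvAdapter())
gw.register("json", JsonAdapter())
gw.on_message("csv", b"dev42,temperature,21.5")
gw.on_message("json", b'{"device": "dev7", "param": "speed", "value": 60}')
```

Supporting a new protocol then means registering one more adapter, without touching the rest of the stack.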
Messaging middleware is software or an appliance that allows senders (publishers) and receivers (consumers) to exchange messages in a loosely coupled manner, without being physically connected to each other. Messaging middleware allows multiple consumers to receive messages from a single publisher, and a single consumer to receive messages from multiple publishers.
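The loose coupling described above can be illustrated with a toy in-process broker: publishers and consumers share only a topic name, never a direct reference to each other. This is a sketch of the pattern, not of any real middleware product.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

class Broker:
    """Toy publish/subscribe broker: the only coupling is the topic name."""
    def __init__(self) -> None:
        self._subs: DefaultDict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]) -> None:
        self._subs[topic].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        # Fan out: every subscriber on the topic gets a copy.
        for handler in self._subs[topic]:
            handler(message)

broker = Broker()
seen_a, seen_b = [], []
broker.subscribe("telemetry/temp", seen_a.append)   # consumer A
broker.subscribe("telemetry/temp", seen_b.append)   # consumer B
broker.publish("telemetry/temp", {"device": "dev42", "value": 21.5})
```

A real middleware adds what the toy omits: durable queues, delivery guarantees, and network transport.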
Messaging middleware is not a new concept; it has long been used as a backbone for message communication between enterprise systems, as an integration pattern connecting various distributed systems in a unified way, or built into the application server. In the context of IoT, the messaging middleware becomes a key capability, providing a highly scalable, high-performance backbone to accommodate a vast and ever-growing number of connected devices. Gartner, Inc. forecasts that 4.9 billion connected things will be in use in 2015, reaching 25 billion by 2020.
The IoT messaging middleware platform needs to provide various device management capabilities, such as registering devices, storing the device meta-model, providing secure connectivity for devices, storing device data for a specified interval, and offering dashboards to view connected devices. The storage requirements imposed by an IoT platform are quite different from those of a traditional messaging platform: we are looking at terabytes of data from connected devices, while still ensuring high performance and fault-tolerance guarantees. The IoT messaging middleware platform typically holds the device data for the specified interval, and in turn a dedicated storage service is used to scale, compute on, and analyze the information.
The device meta-model that we mentioned earlier is one of the key aspects of an Enterprise IoT application. A device meta-model can be visualized as a set of metadata about the device, its parameters (input and output), and the functions (send, receive) that the device may emit or consume. By designing the device meta-model as part of the IoT solution, you can create a generalized solution for each industry/vertical that abstracts away the dependency between the data sent by devices and the data used by the IoT platform and the services that work on that data. You can even work with a virtual device based on the device model, building and testing the entire application without physically connecting the device. We will discuss the device meta-model in detail in the solution section.
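To make the idea concrete, here is one possible shape for a device meta-model, together with a virtual device that only emits readings conforming to the model. The class and field names are assumptions made for this sketch; the detailed meta-model is covered later in the solution section.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Parameter:
    name: str
    unit: str
    direction: str        # "input" or "output"

@dataclass
class DeviceModel:
    device_type: str
    parameters: List[Parameter] = field(default_factory=list)
    functions: List[str] = field(default_factory=list)   # e.g. ["send", "receive"]

class VirtualDevice:
    """Emits readings that conform to the meta-model, so the application
    can be built and tested without connecting physical hardware."""
    def __init__(self, model: DeviceModel, device_id: str) -> None:
        self.model = model
        self.device_id = device_id

    def emit(self, param: str, value) -> dict:
        if param not in {p.name for p in self.model.parameters}:
            raise ValueError(f"{param!r} not defined in model {self.model.device_type}")
        return {"device": self.device_id, "param": param, "value": value}

car = DeviceModel("connected_car",
                  parameters=[Parameter("speed", "km/h", "output")],
                  functions=["send"])
vcar = VirtualDevice(car, "car-001")
reading = vcar.emit("speed", 60)
```

Downstream services depend only on the model, not on any particular vendor's payload format.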
All the capabilities of the IoT stack, starting from the messaging platform right up to the cognitive platform are typically made available as services over the cloud platform, which can be readily consumed to build IoT applications.
We will revisit this topic in detail in Chapter 3, when we talk about the IoT capabilities offered by popular cloud vendors and by open source software.
The data storage component deals with the storage of the continuous stream of data from devices. As mentioned earlier, we are looking at a highly scalable storage service that can store terabytes of data and enable fast retrieval. The data needs to be replicated across servers to ensure high availability and no single point of failure.
Typically a NoSQL database or high-performance optimized storage is used for storing the data. The design of the data model and schema is key to enabling faster retrieval, performing computations, and making life easier for downstream processing systems that use the data for analysis. For instance, if you are storing data from a connected car every minute, the data can be broken down into ids and values: the id field does not change for a connected car instance, but the values keep changing, such as speed = 60 km/hour, then speed = 65 km/hour, and so on. In that case, instead of storing one key = value pair per reading, you can store key = value1, value2 alongside key = timeint1, timeint2, and so on. The data represents a sequence of values for a specific attribute over a period of time. This concept is referred to as a time series, and a database that supports these requirements is called a time series database. You could use NoSQL databases like MongoDB or Cassandra and design your schema so that it is optimized for storing time series data and performing statistical computations. Not all use cases require a time series database, but it is an important concept to keep in mind while designing IoT applications.
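The key = value1, value2 / key = timeint1, timeint2 layout can be sketched as a wide-row store: one row per device, attribute, and time bucket, with timestamps and values appended as parallel lists. This is an in-memory illustration of the schema idea, not a substitute for Cassandra or MongoDB; the class and bucket size are assumptions for the example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

class TimeSeriesStore:
    """Wide-row sketch: row key = (device, attribute, hour bucket),
    row value = parallel lists of timestamps and readings."""
    BUCKET = 3600   # one row per hour of data (arbitrary choice)

    def __init__(self) -> None:
        self.rows: Dict[tuple, Dict[str, list]] = defaultdict(
            lambda: {"ts": [], "values": []})

    def write(self, device: str, attribute: str, ts: int, value) -> None:
        bucket = ts - ts % self.BUCKET
        row = self.rows[(device, attribute, bucket)]
        row["ts"].append(ts)            # key = timeint1, timeint2, ...
        row["values"].append(value)     # key = value1, value2, ...

    def read(self, device: str, attribute: str,
             start: int, end: int) -> List[Tuple[int, object]]:
        out, b = [], start - start % self.BUCKET
        while b <= end:   # scan only the buckets covering the range
            row = self.rows.get((device, attribute, b), {"ts": [], "values": []})
            out += [(t, v) for t, v in zip(row["ts"], row["values"])
                    if start <= t <= end]
            b += self.BUCKET
        return out

store = TimeSeriesStore()
store.write("car-001", "speed", 1000, 60)
store.write("car-001", "speed", 1060, 65)
points = store.read("car-001", "speed", 0, 2000)
```

Because a whole hour of readings lives in one row, a range query touches a handful of rows instead of thousands of individual key/value entries.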
In the context of an enterprise IoT application, the data storage layer should also support the storage and handling of unstructured and semi-structured information. The data from these sources (structured, unstructured, and semi-structured) can be correlated to derive insights. For instance, information from an equipment manual (unstructured text) can be fed into the system, and sensor data from the connected equipment can be correlated with the manual to raise critical functional alerts and suggest corrective measures. The IoT messaging middleware service usually provides a set of configurations to automatically store the incoming data from the devices in the specified storage service.
In a future article, I will discuss the various storage service options provided by cloud providers for storing massive amounts of data from connected devices.
The IoT core platform deals with raw data coming from multiple devices, and not all of that data needs to be consumed and treated equally by your application. We discussed the device gateway pattern earlier, which can filter data before sending it to the cloud platform. However, your device gateway may not have the computational power to store and filter large volumes of data, or it may not cover every filtering scenario; and in some solutions, devices connect directly to the platform without a device gateway. As part of your IoT application, you need to carefully decide what data needs to be consumed and what data is not relevant in that context. The data filter component can apply anything from simple rules to complex conditions based on your data dependency graph to filter the incoming data. The data mapper component is then used to convert raw data from the devices into an abstract data model used by the rest of the components.
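A minimal sketch of the filter and mapper pair might look like this. The rule shape, the raw record fields, and the abstract model are all hypothetical; real platforms typically express filter rules as declarative configuration rather than lambdas.

```python
from typing import Callable, List

def make_filter(rules: List[Callable[[dict], bool]]) -> Callable[[dict], bool]:
    """Combine simple predicate rules: a reading passes only if all hold."""
    def accept(reading: dict) -> bool:
        return all(rule(reading) for rule in rules)
    return accept

def map_raw(raw: dict) -> dict:
    """Data mapper: convert a raw vendor record into the abstract
    data model consumed by the rest of the components."""
    return {"device": raw["id"], "param": raw["k"], "value": raw["v"]}

accept = make_filter([
    lambda r: r["param"] == "speed",     # keep only the attribute we care about
    lambda r: 0 <= r["value"] <= 300,    # drop out-of-range sensor noise
])

raw_stream = [
    {"id": "car-001", "k": "speed", "v": 65},
    {"id": "car-001", "k": "speed", "v": 9999},      # sensor glitch, filtered
    {"id": "car-001", "k": "cabin_temp", "v": 21},   # irrelevant here, filtered
]
clean = [m for m in map(map_raw, raw_stream) if accept(m)]
```

Mapping first and filtering on the abstract model keeps the rules independent of any one vendor's raw format.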
In some cases, you need to contextualize the device data with more information, such as aggregating the current data from devices with data from existing asset management software to retrieve warranty information for physical devices, or with data from a weather station, for further analysis. That's where the data aggregation component comes in, allowing you to aggregate and enrich the incoming data. The aggregation component can also be part of the stream processing framework that we will discuss in the next section, but instead of building complex flows just to aggregate information, this requirement can often be handled without much overhead using simplified flows and a little custom code, without a stream processing infrastructure.
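The asset-management example above reduces to a simple enrichment step. The warranty table and field names here are invented stand-ins for a real asset management system's API.

```python
# Hypothetical warranty records, standing in for an asset management system.
WARRANTY = {"pump-17": {"expires": "2026-01-31", "vendor": "Acme"}}

def enrich(reading: dict, warranty_db: dict) -> dict:
    """Aggregate the incoming device reading with asset-management context,
    so downstream analysis sees a single enriched record."""
    enriched = dict(reading)                              # keep original intact
    enriched["warranty"] = warranty_db.get(reading["device"])
    return enriched

reading = {"device": "pump-17", "param": "vibration", "value": 4.2}
record = enrich(reading, WARRANTY)
```

A flow this simple is exactly the case the text describes: a small custom step, with no need to stand up stream processing infrastructure.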