An Event Collection Framework for IoT

In sustAGE, event persistence, collection, and post-processing are critical to achieving the desired platform capabilities. Because compliance with best practices and industry standards is also a goal, the platform’s design is based on cutting-edge, industry-ready technologies and established practices from the field.
An overview of the main IoT architecture in sustAGE is shown in the figure below.

Figure: Cloud-based streaming analytics and event monitoring IoT framework

1.1 sustAGE Bridge

The sustAGE Bridge is a component that receives the data gathered from all sensors of the IoT ecosystem (both raw sensor data and the outputs of the Monitoring Layer modules) and forwards it for further processing via the Universal Messaging component. The sustAGE Bridge relies on open-source software and provides a modular, adaptive, and secure IoT gateway. It supports different communication protocols and offers an admin interface for essential monitoring and maintenance functions.

The sustAGE Bridge is the base component that transforms raw data such as body indicators, telemetry, location tracking, or environmental conditions into events. It interconnects the IoT ecosystem through a secure gateway, storing events and forwarding them to the Messaging Bus for further processing.
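The store-and-forward behaviour described above can be illustrated with a minimal Python sketch. It is not the sustAGE Bridge implementation: the `Event` fields, the `Bridge` class, and the `publish` callback are hypothetical names chosen for illustration, and the sketch assumes the transport signals an unreachable bus by raising `ConnectionError`.

```python
import json
from collections import deque
from dataclasses import dataclass, asdict


@dataclass
class Event:
    """Hypothetical event shape; the real sustAGE schema is not given in the text."""
    source: str       # sensor or module identifier
    kind: str         # e.g. "heart_rate" or "temperature"
    value: float
    timestamp: float  # seconds since the Unix epoch


class Bridge:
    """Turns raw readings into events and forwards them in arrival order."""

    def __init__(self, publish):
        # `publish` is an assumed callback that sends one JSON string to the bus
        # and raises ConnectionError when the bus cannot be reached.
        self._publish = publish
        self._buffer = deque()

    def ingest(self, source, kind, value, timestamp):
        """Wrap a raw reading as an event, store it, then try to forward."""
        self._buffer.append(Event(source, kind, float(value), timestamp))
        self.flush()

    def flush(self):
        """Forward buffered events; stop at the first failure and keep the rest."""
        while self._buffer:
            payload = json.dumps(asdict(self._buffer[0]))
            try:
                self._publish(payload)
            except ConnectionError:
                return  # the event stays buffered until the next flush
            self._buffer.popleft()

    def pending(self):
        """Number of events stored but not yet forwarded."""
        return len(self._buffer)
```

An event is removed from the buffer only after `publish` returns, so a failed delivery never loses data; the trade-off is that an event may be sent twice if the failure occurs after the bus has already accepted it.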

1.2 Messaging Bus

The Messaging Bus is a reliable and secure pub/sub mechanism that ensures real-time event transport. In the sustAGE set-up, the Universal Messaging Platform from Software AG provides the Messaging Bus functionality, as it offers out-of-the-box adapters and a user-friendly configuration interface. The Messaging Bus in sustAGE is based on asynchronous messaging, which strengthens the decoupling of applications and lets integrators choose between broadcasting messages to multiple receivers, routing a message to one of many receivers, or other topologies. As a result, sustAGE modules send messages without knowing whether the receiving module is up and ready. Likewise, a module or application that was disconnected from the messaging system for whatever reason can receive a message that was sent some time ago as soon as it successfully reconnects.
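The decoupling described above, where a sender does not need the receiver to be online, can be sketched with a small in-memory bus in Python. This is a conceptual illustration only and does not use or mirror the Universal Messaging API; `MessagingBus` and its methods are hypothetical names.

```python
from collections import defaultdict, deque


class MessagingBus:
    """In-memory pub/sub sketch with per-subscriber queues for offline receivers."""

    def __init__(self):
        # topic -> {subscriber name: handler, or None while disconnected}
        self._subscribers = defaultdict(dict)
        # (topic, subscriber name) -> messages waiting for reconnection
        self._pending = defaultdict(deque)

    def subscribe(self, topic, name, handler):
        """Connect (or reconnect) a named subscriber and replay missed messages."""
        self._subscribers[topic][name] = handler
        queue = self._pending[(topic, name)]
        while queue:
            handler(queue.popleft())

    def disconnect(self, topic, name):
        """Mark a known subscriber as offline; its subscription is kept."""
        if name in self._subscribers[topic]:
            self._subscribers[topic][name] = None

    def publish(self, topic, message):
        """Broadcast to every known subscriber; queue for those that are offline.

        In this sketch, messages published before a subscriber first
        registers are not retained for it.
        """
        for name, handler in self._subscribers[topic].items():
            if handler is None:
                self._pending[(topic, name)].append(message)
            else:
                handler(message)
```

The publisher calls `publish` in the same way whether or not receivers are connected; the bus, not the sender, is responsible for holding messages until a subscriber returns.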

1.3 Complex Event Processing (CEP)

The CEP is attached to the Messaging Bus. It consumes base events and computes aggregates, alarms, or higher-level events, which are returned to the Messaging Bus for further processing by other modules. Complex event processing is neither a scheduling technology nor a decision tool but a reactive solution that always needs an external trigger. The rules are defined at design time in EPL, an SQL-like dialect. Consequently, the CEP does not perform a perpetual analysis but evaluates observations within a defined, well-prescribed time window.

Within sustAGE, the streaming analytics engine APAMA is used, which provides a suitable environment for expressing pattern-matching logic. With discrete event matching, the software typically looks for individual events occurring within a time period and in a specific order, for example, “Event A being received, followed by Event B, and this happening within 1 minute”. Streaming aggregation performs calculations across a window of data points in a stream, e.g., “Average value of all Event As across a window of 24 hours”. A set of aggregate functions is provided, which has been substantially extended for the use case scenarios within sustAGE.
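The two rule types above can be sketched in plain Python to make their semantics concrete. This is not EPL and does not reflect how APAMA evaluates rules internally; `followed_by` and `WindowAverage` are hypothetical helpers, and the sketch assumes events arrive in timestamp order as `(timestamp_in_seconds, kind)` pairs.

```python
from collections import deque


def followed_by(events, first, second, within):
    """Yield (t_first, t_second) when `second` follows `first` within `within` seconds.

    Simplifying assumption: only the most recent `first` event is remembered,
    and it is consumed by the next `second` event whether or not it matches.
    """
    last_first = None
    for timestamp, kind in events:
        if kind == first:
            last_first = timestamp
        elif kind == second and last_first is not None:
            if timestamp - last_first <= within:
                yield (last_first, timestamp)
            last_first = None


class WindowAverage:
    """Running average over samples newer than `window` seconds."""

    def __init__(self, window):
        self._window = window
        self._samples = deque()  # (timestamp, value) pairs, oldest first
        self._total = 0.0

    def add(self, timestamp, value):
        """Add a sample, evict expired ones, and return the current average."""
        self._samples.append((timestamp, value))
        self._total += value
        # The newest sample is never evicted, so the deque cannot become empty.
        while self._samples[0][0] <= timestamp - self._window:
            _, expired = self._samples.popleft()
            self._total -= expired
        return self._total / len(self._samples)
```

For the examples in the text, the pattern would be called as `followed_by(events, "A", "B", within=60)` and the aggregate created as `WindowAverage(window=24 * 3600)`. Both are evaluated only when a new event arrives, which matches the reactive, trigger-driven nature of CEP.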

1.4 Time Series DB: InfluxDB and Telegraf

For data collection and persistence, a time-series database (InfluxDB) is deployed together with an event sniffer (Telegraf), which continuously listens to passing events and stores them with very short delay.

In contrast to a relational database, a time-series database is optimized for time-series data, that is, measurements or events that are tracked, monitored, and aggregated over time. In the context of sustAGE, as in most IoT projects, all events are time-related and carry a timestamp. A time-series database is the most suitable solution in such cases, as it is designed to handle metrics and events with a temporal context and is optimized to measure changes over time.

In sustAGE, InfluxDB is used to store essentially all events passing the Messaging Bus. In addition, InfluxDB offers a “Data Explorer”, a built-in viewer that allows simple exploration of data using a graphical query builder, and the Influx Query Language (InfluxQL) for building more complex queries. Continuous data collection is ensured by the open-source server agent Telegraf, which collects all sustAGE events and persists them in InfluxDB.
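To illustrate what persisting an event as time-series data involves, the following Python sketch serializes one event into InfluxDB line protocol, the text format InfluxDB accepts for writes. In the deployment described above this step is handled by Telegraf; the helper names and the measurement, tag, and field names in the usage note are hypothetical.

```python
def _escape(text, chars):
    """Backslash-escape each character in `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def _field_value(value):
    """Render a field value using line-protocol type conventions."""
    if isinstance(value, bool):  # checked before int, since bool is an int subclass
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"       # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Strings are double-quoted; the backslash is escaped first, then the quote.
    return '"' + _escape(str(value), '\\"') + '"'


def to_line_protocol(measurement, tags, fields, timestamp_ns):
    """Build one line: measurement,tag=value field=value timestamp."""
    if not fields:
        raise ValueError("line protocol requires at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys give a stable output
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + _field_value(value)
        for key, value in fields.items()
    )
    return f"{head} {body} {timestamp_ns}"
```

For example, `to_line_protocol("heart_rate", {"user": "u1"}, {"bpm": 72.5}, 1600000000000000000)` produces `heart_rate,user=u1 bpm=72.5 1600000000000000000`. Tags are indexed metadata and fields hold the measured values, which is why every event needs at least one field and a timestamp.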

1.5 Query Acceleration using Dremio

Dremio is a data lake engine that supports the management of data and speeds up data queries. It permits data access from different “angles”, as data scientists, analytics experts, and business-level decision-makers each need data views appropriate to their distinct needs. Dremio can join data from various sources and relate it to external data sources for advanced analytics. Its semantic layer of data views enables data management, sharing, and data curation while preserving security aspects. It further offers data migration features, such as replication of legacy JDBC protocols, to speed up data migration.