Challenges
The CIM was designed and implemented to address a number of practical
challenges derived from the authors’ collective experiences
participating, on the one hand, in research projects of various TRLs
and, on the other, in private collaborations with real-world
stakeholders spanning multiple application domains. Figure 1 depicts a
high-level view of the CIM and its capabilities. The difficulties and
how the CIM addresses them are presented in the following subsections:
decentralising DR systems, security and privacy, data validation, and
semantic interoperability.
Fig. 1. CIM architecture.
Decentralising DR systems
DR systems rely on many components, which can be broadly classified as
services or devices. The former often consume data supplied by the
latter to compute results that are translated into actions, such as
forecasting the load of an infrastructure or issuing DR signals to other
components. Devices are typically, though not always, IoT devices that
gather client data and act on the energy infrastructure; for example, a
device's behaviour can be modified when a DR signal is received.
The rising prevalence of IoT devices, which are important data sources
in this context, raises scalability concerns that render DR designs
relying on centralized servers for data collection impractical.
Furthermore, in modern systems, devices play a more active role, as they
must respond to signals sent by other architectural components, such as
DERs receiving control signals from utilities.
As IoT devices do not have public endpoints, there is a clear need for a
distributed and scalable communication layer that supports duplex
message exchange. Furthermore, the communication layer must offer
service liveness, i.e., tolerance to failures such as server outages.
Finally, the communication layer should provide enterprise-grade
security while simultaneously adhering to data protection regulations
such as the GDPR.
Cloud computing enables a diverse set of services, meeting the needs of
both devices and services [43]. It allows devices or services to be
distributed among large groups of networked remote servers, offloading
the processing power these components would require in centralized
designs. Because a large quantity of data is exchanged in the context of
DR, a suitable cloud solution is a network based on the eXtensible
Messaging and Presence Protocol (XMPP) [44].
The fundamental benefit of using XMPP as the default communication
protocol is that it is a well-established and standardised protocol
designed for real-time data streams, with several open-source client and
server implementations and support for a variety of operating systems
[45]. Furthermore, cloud networks based on this protocol have
various security features in place to secure communications, such as
SASL for authentication and TLS for data encryption.
However, transitioning from a centralized architecture to a
decentralized one based on XMPP networks is not a trivial task. Adopting
this strategy requires extending the technical stacks of centralized DR
systems in order to, first, construct the XMPP network and, second,
exchange data across it. To this end, the CIM can be used as middleware
to exchange data across an XMPP cloud; existing systems do not need to
adopt a new technical stack when using the CIM.
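The duplex exchange that the communication layer must support can be sketched with a toy in-memory broker. All names here (`Broker`, `Endpoint`) are hypothetical stand-ins for illustration only; a real deployment would route messages through an actual XMPP server and client library rather than this simplified model.

```python
# Toy sketch of duplex message exchange between DR components registered
# on a shared broker. Broker stands in for the XMPP server, Endpoint for
# an XMPP client wrapping a service or device; both names are assumptions.

class Broker:
    """Routes messages between registered endpoints by a JID-like id."""
    def __init__(self):
        self.endpoints = {}

    def register(self, endpoint):
        self.endpoints[endpoint.jid] = endpoint

    def route(self, sender, recipient, payload):
        self.endpoints[recipient].receive(sender, payload)

class Endpoint:
    """Stand-in for an XMPP client wrapping a DR service or device."""
    def __init__(self, jid, broker):
        self.jid, self.broker, self.inbox = jid, broker, []
        broker.register(self)

    def send(self, recipient, payload):
        self.broker.route(self.jid, recipient, payload)

    def receive(self, sender, payload):
        self.inbox.append((sender, payload))
        # Duplex behaviour: a device reacts to a DR signal by replying
        # with an acknowledgement, instead of only pushing data upstream.
        if payload.get("type") == "dr-signal":
            self.send(sender, {"type": "ack", "ref": payload["id"]})

broker = Broker()
service = Endpoint("service@cloud", broker)
device = Endpoint("meter-1@cloud", broker)

# The service issues a DR signal; the device answers on the same channel.
service.send("meter-1@cloud", {"type": "dr-signal", "id": "evt-42"})
```

The point of the sketch is that neither party needs a public endpoint: both only hold a connection to the broker, which is what makes the XMPP model attractive for IoT devices behind NAT.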
Security and privacy
Most DR proposals rely on the cloud for data consumption. These
approaches have several disadvantages in terms of security and privacy.
On the one hand, the various components of a DR system must publicly
expose their data exchange endpoints, which creates potential access
opportunities for attackers; and because the cloud is employed as a vast
data repository, the data contained therein is also exposed. On the
other hand, because all data is housed in a third-party store rather
than the system that originally produced it, the platform where the data
is kept effectively becomes its owner. Furthermore, because all the data
is centralized, maintaining control over it becomes difficult.
Decentralized XMPP networks, by contrast, provide several layers of
security both for joining the network and for transferring data.
Furthermore, because the components that participate in such networks
sit behind XMPP clients, the only way to communicate with them is
through a non-public network. As a result, these networks are an
excellent alternative for decentralizing DR systems, which automatically
become more secure against external attackers.
Different XMPP clients may maintain their own decentralized access
control list policies to protect their privacy, while XMPP servers can
cluster the clients and establish additional access control list
policies for transferring data. Furthermore, any data sent over the XMPP
network may be encrypted end to end.
The CIM enables connection to an XMPP cloud; to join, CIMs require a
SASL certificate provided by a network administrator. Furthermore, the
CIM encrypts all communication over the XMPP network with a TLS
certificate. As an extra privacy layer, the CIM employs a whitelist
access control list (ACL) mechanism so that requests from non-authorised
CIMs in the XMPP cloud are not handled.
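A whitelist ACL of this kind can be sketched in a few lines. The names used here (`AccessControlList`, `handle_request`) are illustrative and not the CIM's actual API; the sketch only shows the behaviour described above, where requests from non-whitelisted peers are dropped.

```python
# Minimal sketch of a whitelist ACL: only requests from explicitly
# listed CIM identities are handled; everything else is discarded.
# Class and function names are assumptions, not the CIM's real API.

class AccessControlList:
    def __init__(self, allowed_jids):
        self.allowed = set(allowed_jids)

    def permits(self, jid):
        return jid in self.allowed

def handle_request(acl, sender_jid, payload):
    """Process a request only if the sender is on the whitelist."""
    if not acl.permits(sender_jid):
        return None  # request from a non-authorised CIM: silently dropped
    return {"status": "processed", "echo": payload}

acl = AccessControlList(["cim-a@cloud", "cim-b@cloud"])
handle_request(acl, "cim-a@cloud", {"q": "load"})    # handled
handle_request(acl, "intruder@cloud", {"q": "load"})  # dropped
```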
Finally, another critical security issue is the CIM's communication with
local data sources, namely the DR components. Although this
communication takes place in a trusted local infrastructure, the CIM
requires these infrastructures to employ an authentication mechanism to
interact with it, enhancing security not only in the cloud but also in
the local infrastructure.
Semantic interoperability
As previously stated, semantic interoperability is defined as systems'
ability to transparently exchange data and have a shared understanding
of it [41], which results in the ability to transparently consume
the transferred data. Once the data exchange layer, in this case XMPP,
has been established, the common understanding must be established. A
frequent approach to this end is to develop a common ontology and
exchange data represented according to it [23]. The use of
ontologies provides several interoperability benefits; however, the
shared data must be in RDF format.
There are several information sources in the context of DR, ranging from
IoT devices to data provided by ad hoc DR components and/or databases.
The heterogeneity of the DR standards landscape has already been
described in the related work; in practice, however, this heterogeneity
is even greater due to the many devices that rely on a wide range of
vendor-defined formats and models, which are not DR standards at all.
An active challenge is therefore to take non-RDF data sources (DR
systems, IoT devices, etc.) and provide a transparent mechanism for
translating their data into an equivalent RDF version, modelled
according to a common ontology, so that the data is understandable by
the other actors in the exchange, who expect RDF data expressed in the
agreed ontology. Uplift refers to this process of converting data on the
fly into an equivalent RDF version.
Furthermore, there is the inverse issue: when data is queried back, a
semantically interoperable version is returned. However, because DR
systems rely on diverse formats and models, the interoperable data must
be converted back into the specific format and model that each system
supports. Downlift refers to the process of converting data from RDF
back into an equivalent non-RDF form.
To that end, the CIM includes Uplifting and Downlifting mechanisms,
which allow DR systems to translate their data before sending it to the
XMPP cloud so that third-party entities can understand it, and to adapt
data from the XMPP network to their own formats and models so that it
can be consumed transparently.
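The uplift/downlift round trip can be illustrated with JSON-LD, one RDF serialisation. The vendor payload, the field mapping, and the ontology terms (loosely borrowed from the SAREF namespace) are all assumptions for the sake of the example, not the CIM's actual mappings.

```python
# Illustrative uplift/downlift round trip between a vendor-specific
# payload and an equivalent JSON-LD (RDF) document. The mapping and the
# SAREF-style terms are assumptions, not the CIM's real configuration.

SAREF = "https://saref.etsi.org/core/"

# Vendor-specific, non-RDF reading from an IoT meter (hypothetical format).
vendor_payload = {"dev": "meter-1", "kw": 3.2, "ts": "2023-05-01T10:00:00Z"}

def uplift(payload):
    """Translate a vendor payload into an equivalent JSON-LD document."""
    return {
        "@context": {"saref": SAREF},
        "@id": f"urn:dev:{payload['dev']}",
        "@type": "saref:Measurement",
        "saref:hasValue": payload["kw"],
        "saref:hasTimestamp": payload["ts"],
    }

def downlift(doc):
    """Translate the JSON-LD document back into the vendor's format."""
    return {
        "dev": doc["@id"].removeprefix("urn:dev:"),
        "kw": doc["saref:hasValue"],
        "ts": doc["saref:hasTimestamp"],
    }

rdf_doc = uplift(vendor_payload)
assert downlift(rdf_doc) == vendor_payload  # lossless round trip
```

The key property is that the round trip is lossless for the fields covered by the mapping, so a DR system can keep consuming its own format while its peers only ever see the ontology-aligned RDF version.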
Data validation
Data in real-world systems rarely remains within the tight confines in
which it is recorded. As previously stated, contemporary systems require
interactions among different components that share data in order to
perform various functions. When a data payload is received, the first
responsibility of each component is to validate the message.
Validation is classified into two types: (1) syntactic validation, which
enforces the correct syntax of the data (e.g., ensuring correct JSON-LD
syntax); and (2) semantic validation, which checks that the data is
consistent and meets a set of requirements.
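The two stages can be sketched as follows. The specific semantic constraints shown (a required `@context`, a non-negative measurement value under a SAREF-style key) are illustrative assumptions, not the CIM's actual rules.

```python
import json

# Sketch of the two validation stages for an incoming payload:
# (1) syntactic: does the payload parse as JSON at all?
# (2) semantic: is the parsed document consistent with agreed constraints?
# The constraints below are illustrative, not the CIM's real rule set.

def validate_syntax(raw):
    """Syntactic validation: the payload must be well-formed JSON."""
    try:
        return json.loads(raw), []
    except json.JSONDecodeError as exc:
        return None, [f"syntax error: {exc.msg}"]

def validate_semantics(doc):
    """Semantic validation: consistency checks against agreed requirements."""
    errors = []
    if "@context" not in doc:
        errors.append("missing @context (payload is not JSON-LD)")
    value = doc.get("saref:hasValue")
    if not isinstance(value, (int, float)) or value < 0:
        errors.append("saref:hasValue must be a non-negative number")
    return errors

raw = '{"@context": {"saref": "https://saref.etsi.org/core/"}, "saref:hasValue": 3.2}'
doc, syntax_errors = validate_syntax(raw)
semantic_errors = validate_semantics(doc) if not syntax_errors else []
```

Running semantic checks only after the syntactic stage succeeds mirrors the ordering in the text: a payload that is not even well-formed cannot be meaningfully checked for consistency.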