Multi-protocol bridge generation for M2M communication using MQTT

MQTT (Message Queuing Telemetry Transport) is a quasi-standard for communication between IoT devices, and OPC UA (Open Platform Communications Unified Architecture) is the de facto standard for M2M communication in modern Industry 4.0 environments. But there are also industrial machines that still use RESTful (Representational State Transfer) APIs as their only communication interface. Integrating machines that use these three protocols into automated manufacturing processes and enabling machine-to-machine communication between them can be challenging. This paper demonstrates how M2M communication supporting HTTP, MQTT and OPC UA can be implemented using MQTT as the common communication protocol. Furthermore, the goal is to create a model for each bridge that can be used to define interface mappings from MQTT to HTTP or from MQTT to OPC UA. This model can then be used to generate MQTT clients that bridge the different protocols and enable machines to share their information.


Introduction
Automating manufacturing processes is an important goal for small, medium-sized and large manufacturers. Its benefits include higher productivity, better material use, better product quality, shorter factory lead times and many more. But integrating industrial machines into an automated process is not an easy task, especially given the wide range of available communication standards and protocols.
When automating machines in a manufacturing environment there is a big focus on event-based communication. The information the machines provide is important to control processes, optimize workflows, provide transparency, and much more.
To enable M2M communication between machines with different protocols, programs that understand multiple protocols are often used. But such multi-protocol middleware is often limited to a set of two or sometimes three different protocols, because the differences between some protocols can be quite large.
So instead of introducing multiple middleware platforms into an environment to enable M2M communication, another possible solution is to introduce one base communication protocol that is used by all machines, together with small programs that translate one protocol to another and are used wherever needed.
These small translators are often called application layer bridges or just bridges. The idea of bridging multiple protocols to integrate devices into an environment is not new [16]. There are several proposals, especially for IoT protocols. One example is the QEST broker [1], whose idea is to create a middleware that supports REST and MQTT. Although this project had great

Organisation
The paper is organized as follows. The following section gives a brief introduction to HTTP and MQTT as well as a comparison of both, followed by OPC UA. Section 3 discusses the mapping from MQTT to HTTP, section 4 discusses the mapping from MQTT to OPC UA, and section 5 concludes the paper.

Different protocol paradigms
HTTP [2] and MQTT [3] differ a lot regarding their intended usage. HTTP is based on the client/server pattern, where the client sends a request to the server and the latter sends a response back to the client. The communication is tightly coupled and synchronous. There are a few approaches to make HTTP more event-based, for example long-polling and webhooks, which are also proposed in [13] as the best choice for a pub/sub-like approach. But these two techniques require a server that supports such event-based messaging. Otherwise, the only way to get data periodically from an HTTP endpoint is to implement a polling mechanism in the client, which can sometimes be a feasible approach.
MQTT, on the other hand, is built with the publish/subscribe pattern in mind and offers asynchronous messaging, which further decouples the communication.
MQTT version 5.0 [4] added support for the request/response pattern used by HTTP. It does so through the response topic paired with the correlation data.
The response topic field can be added to an MQTT message to specify the topic to which receivers of the message should send their response. The correlation data field can be used to identify a particular request/response pair, but it is optional: the response topic can be used without it.
Another feature that resembles HTTP is the addition of custom header fields to the MQTT header (user properties in MQTT 5.0 terms). Using this feature, it is possible to define header fields (key-value pairs) that add metadata to the MQTT message.
However, these features introduced in MQTT 5.0 are not mandatory for a mapping from MQTT to HTTP. The same behavior can also be implemented on top of MQTT version 3.1, for example by carrying the additional fields in the message body.
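As a minimal sketch of how the request/response fields could be emulated on MQTT 3.1, the example below packs them into a JSON envelope inside the message body. The field names `responseTopic`, `correlationData` and `header`, the helper names, and the reply topic are assumptions made for illustration; they are not prescribed by MQTT or by this paper.

```python
import json
from typing import Any, Dict, Optional, Tuple


def wrap_request(payload: Any,
                 response_topic: str,
                 correlation_data: Optional[str] = None,
                 header: Optional[Dict[str, str]] = None) -> bytes:
    """Pack request metadata into the body, since MQTT 3.1 has no properties."""
    envelope = {
        "responseTopic": response_topic,       # hypothetical field name
        "correlationData": correlation_data,   # optional, may stay None
        "header": header or {},                # stands in for custom header fields
        "payload": payload,
    }
    return json.dumps(envelope).encode("utf-8")


def build_response(request_bytes: bytes, result: Any) -> Tuple[str, bytes]:
    """Return (topic, body) for the reply, echoing the correlation data."""
    request = json.loads(request_bytes.decode("utf-8"))
    reply = {
        "correlationData": request.get("correlationData"),
        "payload": result,
    }
    return request["responseTopic"], json.dumps(reply).encode("utf-8")


# Usage: the requester names the topic it listens on for the answer.
request = wrap_request({"programNo": 4771},
                       "Manager/Machines/0001/Start/Response",
                       correlation_data="req-1")
topic, reply = build_response(request, {"status": "started"})
```

With MQTT 5.0 the same information would travel in the message properties instead, leaving the body untouched.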

Defining a main focus for OPC UA
OPC UA is a framework that has many more features than MQTT. Its purpose is to model the information of machines and provide services to work with this data, rather than just to provide a protocol for transmitting data. However, we will use a feature that the two have in common: the publish/subscribe pattern, introduced with part 14 of the OPC UA specification [13]. The information of an OPC UA server is stored in the information model [9], which is a graph of nodes defined in the address space [7] of the server. There are ten service sets [8] with services to manage the address space of an OPC UA server.
Many different node classes are already defined in the standard address space model, but the most important ones for information retrieval are those that hold the actual information.
The OPC UA framework is much more complex than MQTT. A complete mapping from MQTT to OPC UA would be too extensive for this work.
The focus here is on reading information from the server's address space, writing values to it, and invoking remote procedure calls on method nodes within the server.

MQTT to HTTP mapping
After introducing MQTT and HTTP, the next step is to look at the protocols and extract the properties needed to build such a mapping. An MQTT client that bridges the information from an MQTT broker to an HTTP server must map the messages from its subscriptions to an HTTP endpoint. The response to the endpoint request must then be published back to the MQTT broker.

General mapping
An HTTP request consists of the request header and (in most cases) a request body. Since an HTTP endpoint is uniquely defined by its URI and HTTP method, these two properties are essential to create a topic mapping. If the HTTP endpoint is freely definable and not fixed, a 1:1 mapping is the obvious choice. Such a mapping could look as follows:
    Topic: api/job/start/POST
    HTTP endpoint: POST /api/job/start HTTP/1.1
But usually the endpoints of machine interfaces are defined by their manufacturers, so arbitrary mappings must be possible.
The header fields could either be included within the body of the MQTT message in a separate header object or be added to the MQTT message header when using version 5.0.
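The 1:1 mapping and the need for arbitrary mappings can be sketched as follows. The function names and the explicit mapping table are hypothetical, and the pairing of the topic `Manager/Machines/0001/Start` with `/api/job/start` is an assumption made only for this example.

```python
from typing import Dict, Tuple

HTTP_METHODS = {"GET", "POST", "PUT", "PATCH", "DELETE", "HEAD", "OPTIONS"}

# Hypothetical explicit table for endpoints fixed by the manufacturer.
EXPLICIT_MAPPINGS: Dict[str, Tuple[str, str]] = {
    "Manager/Machines/0001/Start": ("POST", "/api/job/start"),
}


def topic_to_endpoint(topic: str) -> Tuple[str, str]:
    """1:1 mapping: the last topic level is the HTTP method, the rest the path."""
    path, _, method = topic.rpartition("/")
    method = method.upper()
    if method not in HTTP_METHODS or not path:
        raise ValueError(f"topic {topic!r} does not end in an HTTP method")
    return method, "/" + path


def endpoint_to_topic(method: str, path: str) -> str:
    """Inverse of topic_to_endpoint."""
    return f"{path.strip('/')}/{method.upper()}"


def resolve(topic: str) -> Tuple[str, str]:
    """Prefer an explicit mapping, fall back to the 1:1 convention."""
    if topic in EXPLICIT_MAPPINGS:
        return EXPLICIT_MAPPINGS[topic]
    return topic_to_endpoint(topic)
```

Keeping the explicit table separate from the convention lets a bridge serve both freely definable and manufacturer-defined endpoints.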

URL and query parameters
Many HTTP request URLs contain parts that cannot be mapped 1:1 to a topic: the parameters. One possible solution is to treat them exactly like the fields of the HTTP header. In this case, the order of the variables in the URL is important. The query parameters are named anyway and can be included in a separate object of the MQTT message. An example message for MQTT version 3:
    {
      "header": {
        "Content-Type": "application/json",
        "Accept": "application/json"
      },
      "queryparams": ["v1", "0001"],
      "urlparams": {
        "id": "user",
        "pw": "1234"
      },
      "payload": {
        "programNo": 4771
      }
    }
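A bridge could assemble the request URL from such a message roughly as shown below. In the example message the list holds the ordered values and the object holds the named ones; this sketch assumes that the ordered values fill placeholders in the URL and that the named values form the query string. The URL template and the function name are hypothetical.

```python
from typing import Dict, Sequence
from urllib.parse import quote, urlencode


def build_url(template: str, ordered: Sequence[str], named: Dict[str, str]) -> str:
    """Fill '{}' placeholders in order, then append named values as a query string."""
    if template.count("{}") != len(ordered):
        raise ValueError("number of ordered values does not match the template")
    # Percent-encode each value so it cannot break out of its path segment.
    path = template.format(*(quote(value, safe="") for value in ordered))
    return f"{path}?{urlencode(named)}" if named else path


message = {
    "header": {"Content-Type": "application/json", "Accept": "application/json"},
    "queryparams": ["v1", "0001"],
    "urlparams": {"id": "user", "pw": "1234"},
    "payload": {"programNo": 4771},
}

# Hypothetical template stored in the bridge's mapping for this topic.
url = build_url("/api/{}/machines/{}/start",
                message["queryparams"], message["urlparams"])
```

Because the ordered values are matched to placeholders by position only, a mismatch in count is rejected rather than silently producing a wrong URL.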

HTTP Response handling
The related work says little to nothing about how the response data of a request is handled. We propose two ways to handle the HTTP response so that the valuable data from the endpoint is not lost.

Using the response topic.
With the response topic feature, the response coming from the endpoint can easily be mapped to the given topic. The disadvantage of this approach is that the MQTT message must contain all the data from the HTTP response, and distinguishing between different status codes and other values is left to the subscribing client, as is usual when handling HTTP responses.

HTTP status code mapping.
A more granular approach is to create a switch statement over the HTTP status code. With a mapping from the HTTP status code to one or more specific topics, it is possible to route the response data more precisely. For instance, an explicit topic can be mapped to a "200 OK" response, so that a client that only needs to know whether a request to an endpoint was successful receives only this information. Other status codes, for instance those that indicate an error, could be mapped to a specific error topic.
An advantage of this approach is that topics can be used more precisely. It also allows status codes to be grouped and a default (fall-through) topic to be defined to which all other responses are sent. In addition, the amount of data transmitted can be reduced by applying a filter to these mappings. The filter specifies single fields (or the whole body) to be dropped; sometimes the event of receiving a message is information enough. The idea of a filter could also be used for the first option, the response topic.
On the other hand, the topics for the different response codes all need to be specified first. There are over 40 default status codes defined for HTTP/1.1, and it is possible to define custom ones as well.
Figure 1 describes a mapping of a response status code. The MachineManager publishes a message to the topic Manager/Machines/0001/Start, to which the Rest2Mqtt bridge is subscribed. On receiving this message, the bridge sends a request to the mapped HTTP endpoint. The response is then evaluated and, depending on the status code, a message is published to the mapped topic.
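The status code mapping with grouping, a default case and a filter could look roughly like the sketch below. The class and function names, the outgoing topic names and the field `trace` are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional, Tuple


@dataclass
class ResponseRule:
    predicate: Callable[[int], bool]      # decides whether the rule applies
    topics: List[str]                     # topics to publish the response to
    drop_fields: List[str] = field(default_factory=list)  # filter: single fields
    drop_body: bool = False               # filter: the whole body


def route_response(status: int,
                   body: Dict[str, Any],
                   rules: List[ResponseRule],
                   default_topics: List[str]
                   ) -> List[Tuple[str, Optional[Dict[str, Any]]]]:
    """Return (topic, payload) pairs for the first matching rule."""
    for rule in rules:
        if rule.predicate(status):
            if rule.drop_body:
                payload = None            # the event itself is the information
            else:
                payload = {k: v for k, v in body.items()
                           if k not in rule.drop_fields}
            return [(topic, payload) for topic in rule.topics]
    # Fall-through case: everything not matched goes to the default topics.
    return [(topic, dict(body)) for topic in default_topics]


# Hypothetical rules for the topic Manager/Machines/0001/Start.
RULES = [
    ResponseRule(lambda s: s == 200,
                 ["Manager/Machines/0001/Start/Ok"], drop_body=True),
    ResponseRule(lambda s: 400 <= s < 600,
                 ["Manager/Machines/0001/Start/Error"], drop_fields=["trace"]),
]
DEFAULT_TOPICS = ["Manager/Machines/0001/Start/Other"]
```

Using predicates instead of exact codes keeps the rule list short, which addresses the concern that every status code would otherwise need its own topic.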

Authentication
The HTTP authentication is a part of the HTTP header and could be added to each endpoint individually. To remove this redundancy, a set of common header fields could be defined, which is applied to each individual endpoint that needs the authentication credentials.
But these authentication credentials are not always known at the start. To be more dynamic, an extra endpoint could be defined that needs to be called first if the credentials are not supplied. The authentication credentials could then be added to the common header fields.
Figure 2 shows a model that holds all the information needed to create a client that acts as a bridge from MQTT to HTTP. The HttpEndpointDefinition represents the information needed to make an HTTP request to the server. The MqttEndpointDefinition describes a message with its properties and content as well as the topic. Some of its fields are only relevant when publishing a message. An EndpointMapping represents the mapping from an incoming MQTT message (received through a subscription to a topic) to an HTTP request made to the HTTP server. Additionally, the Responses represent a mapping from a predicate applied to the status code of the response to a list of MqttEndpointDefinition to publish the result to. If a response topic is present in the header of the MQTT message, this mapping can be optional.
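One way to express this model in code is sketched below. Only the class names HttpEndpointDefinition, MqttEndpointDefinition and EndpointMapping come from the text; every field name, the ResponseMapping and BridgeModel classes, and the sample values are assumptions, since the figure's attributes are not reproduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class HttpEndpointDefinition:
    method: str
    url: str
    headers: Dict[str, str] = field(default_factory=dict)  # endpoint-specific


@dataclass
class MqttEndpointDefinition:
    topic: str
    qos: int = 0
    retain: bool = False      # only relevant when publishing


@dataclass
class ResponseMapping:
    predicate: Callable[[int], bool]            # applied to the status code
    publish_to: List[MqttEndpointDefinition]


@dataclass
class EndpointMapping:
    incoming: MqttEndpointDefinition            # subscription that triggers the call
    target: HttpEndpointDefinition
    responses: List[ResponseMapping] = field(default_factory=list)  # optional


@dataclass
class BridgeModel:
    common_headers: Dict[str, str]              # e.g. authentication credentials
    mappings: List[EndpointMapping]

    def headers_for(self, mapping: EndpointMapping) -> Dict[str, str]:
        """Common headers first, endpoint-specific headers override them."""
        return {**self.common_headers, **mapping.target.headers}


start = EndpointMapping(
    incoming=MqttEndpointDefinition("Manager/Machines/0001/Start"),
    target=HttpEndpointDefinition("POST", "/api/job/start",
                                  {"Content-Type": "application/json"}),
)
model = BridgeModel({"Authorization": "Bearer example-token"}, [start])
```

Storing the credentials once in the common header fields removes the redundancy described above, and they can be replaced at run time after a call to a dedicated authentication endpoint.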

MQTT to OPC UA mapping
An MQTT client that bridges the information between an MQTT broker and an OPC UA server must map the incoming messages from its subscriptions to an appropriate action on the server's address space. The action type can either be put into the body of the MQTT message or be added to the MQTT header as metadata. Depending on the action, a result must be published back to the MQTT broker.

Information gathering
The OPC UA service sets, which are responsible for gathering information from the server's address space, are Attribute for reading and writing data from and to nodes, Method for making remote procedure calls and Subscription and MonitoredItem for creating and managing subscriptions. These are the service sets we will focus our mapping on.

Variable service mapping
The Attribute service set offers services to read and write data from and to nodes. To get the value of a node, the only thing the bridge needs to know is the node identifier. The message the bridge needs in order to read data from the address space could look like the following:
    Topic: Machines/0002/State
    {
      "action": "Read",
      "nodeId": "ns=1;id=1234"
    }
The action signals a read operation and the nodeId is the unique node identifier in the address space. The received value can then be published to a specified topic or a list of topics.
To write a value to a node in the OPC UA address space, we also need to supply the value that should be written. Otherwise the message is the same as the one used to read the data.
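A bridge could dispatch such read and write messages as in the following sketch. To stay self-contained it uses an in-memory stand-in for the OPC UA client rather than a real library; the class `FakeAddressSpace`, the function name and the `value` field of the write message are assumptions.

```python
import json
from typing import Any, Dict, Optional


class FakeAddressSpace:
    """In-memory stand-in for an OPC UA client connection (illustration only)."""

    def __init__(self, values: Dict[str, Any]) -> None:
        self._values = dict(values)

    def read(self, node_id: str) -> Any:
        return self._values[node_id]

    def write(self, node_id: str, value: Any) -> None:
        self._values[node_id] = value


def handle_message(payload: bytes, server: FakeAddressSpace) -> Optional[Any]:
    """Map an incoming MQTT message body to an action on the address space."""
    message = json.loads(payload)
    action = message["action"]
    node_id = message["nodeId"]
    if action == "Read":
        return server.read(node_id)          # value to publish back
    if action == "Write":
        server.write(node_id, message["value"])  # "value" is a hypothetical key
        return None
    raise ValueError(f"unsupported action: {action!r}")


server = FakeAddressSpace({"ns=1;id=1234": "Idle"})
state = handle_message(b'{"action": "Read", "nodeId": "ns=1;id=1234"}', server)
```

In a real bridge, the returned value would be published to the mapped topic or to the response topic of the request.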

Subscription service mapping
Creating a subscription, either to get the value of a variable whenever it changes or to be notified of events that happen on the OPC UA server, also requires a node identifier. Additionally, an interval can be set, as well as a flag to enable or disable the subscription. The bridge subscribes to the specified node in the OPC UA server's address space and is notified whenever the node changes. The received value can then be published to a specified topic. This can be either a topic mapped in the data model or a response topic set in the MQTT header.
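The flow from an enabling message to a published value can be sketched with an in-memory stand-in for the server. The `FakeServer` class, the message fields `interval` and `enabled`, the default interval and the outgoing topic are all assumptions; a real bridge would create the subscription and monitored item through an OPC UA client library instead.

```python
from typing import Any, Callable, Dict, List, Tuple


class FakeServer:
    """Stand-in that notifies a callback when a node value changes."""

    def __init__(self) -> None:
        self._callbacks: Dict[str, Callable[[Any], None]] = {}

    def subscribe(self, node_id: str, interval_ms: int,
                  callback: Callable[[Any], None]) -> None:
        # interval_ms is accepted but unused in this in-memory stand-in.
        self._callbacks[node_id] = callback

    def unsubscribe(self, node_id: str) -> None:
        self._callbacks.pop(node_id, None)

    def change(self, node_id: str, value: Any) -> None:
        callback = self._callbacks.get(node_id)
        if callback is not None:
            callback(value)


def handle_subscription(message: Dict[str, Any], server: FakeServer,
                        publish: Callable[[str, Any], None],
                        out_topic: str) -> None:
    """Enable or disable forwarding of node changes to an MQTT topic."""
    node_id = message["nodeId"]
    if message.get("enabled", True):
        server.subscribe(node_id, message.get("interval", 1000),
                         lambda value: publish(out_topic, value))
    else:
        server.unsubscribe(node_id)


published: List[Tuple[str, Any]] = []
server = FakeServer()
request = {"action": "Subscribe", "nodeId": "ns=1;id=1234",
           "interval": 500, "enabled": True}
handle_subscription(request, server,
                    lambda topic, value: published.append((topic, value)),
                    "Machines/0002/State/Value")
server.change("ns=1;id=1234", "Running")   # forwarded to the topic
handle_subscription({**request, "enabled": False}, server,
                    lambda topic, value: published.append((topic, value)),
                    "Machines/0002/State/Value")
server.change("ns=1;id=1234", "Idle")      # no longer forwarded
```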

Method service mapping
The last service can be compared to the HTTP request/response pattern. Sending a request to an endpoint that computes a result is a lot like making a remote procedure call. The result is not always important, and often the caller does not need it at all.
The bridge subscribes to an incoming topic whose messages carry the node identifier and the parameters necessary to call the function. The result could then be published as an MQTT message to an outgoing topic. If the programming language in which the bridge is implemented has problems with converting the argument types, the argument type information can be passed as an additional array that tells the bridge how to convert the values before making the call to the server. Figure 3 illustrates a mapping for a remote procedure call.
    Topic: Machines/0002/Foo
    {
      "action": "Call",
      "nodeId": "ns=1;id=42",
      "args": ["1", "4.5", "false"]
    }
The AddressSpaceAction represents the information for the OPC UA client of the bridge. Each action has a node as a target and thus needs a node identifier. The additional fields needed for the different actions are expected to be supplied in the args dictionary. The EndpointMapping differs from the one in the MQTT to HTTP model only by the absence of the status code predicate, although it is possible to add a general predicate for responses.
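The optional argument type array mentioned for method calls could be applied as in the sketch below. The type names and the converter table are assumptions; the text does not define how the type information is encoded.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence


def _to_bool(text: str) -> bool:
    """Parse 'true'/'false'; bool('false') would be True, so compare the text."""
    lowered = text.strip().lower()
    if lowered not in ("true", "false"):
        raise ValueError(f"not a boolean: {text!r}")
    return lowered == "true"


# Hypothetical type names mapped to converter functions.
CONVERTERS: Dict[str, Callable[[str], Any]] = {
    "int": int,
    "float": float,
    "bool": _to_bool,
    "string": str,
}


def convert_args(args: Sequence[str],
                 arg_types: Optional[Sequence[str]] = None) -> List[Any]:
    """Convert string arguments before passing them to the method call."""
    if arg_types is None:
        return list(args)                 # no type information: pass through
    if len(args) != len(arg_types):
        raise ValueError("args and argument types differ in length")
    return [CONVERTERS[type_name](value)
            for value, type_name in zip(args, arg_types)]


converted = convert_args(["1", "4.5", "false"], ["int", "float", "bool"])
```

Explicit converters avoid relying on language-specific guessing, which is the problem the additional array is meant to solve.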

Loose ends
There are many OPC UA features that were not covered here: security and session management, historical data and especially node management. For instance, to connect to the OPC UA server, the security policy, the message encryption and the user token type also need to be supplied in addition to the endpoint URL [22].

Conclusion
Generating application layer bridges for single machines where they are needed can be a great advantage over introducing a whole multi-protocol middleware into an environment. But this clearly depends on the environment into which the machines need to be integrated and on the protocols and standards used.
Using the proposed models to create bridges makes the integration of machines into a multi-protocol environment much easier and faster. This applies even more when specifications like OpenAPI [20] for RESTful APIs or the NodeSet specifications [21] for OPC UA address spaces are used to translate the data from the specifications into the models. The presented approach makes it easy to connect machines that do not offer standard communication protocols for Industry 4.0 or the smart factory, such as OPC UA or MQTT. Small and medium-sized companies in particular can profit from the presented approach. Big companies will be able to connect their "non-standardized" machines to higher-level systems, such as Enterprise Resource Planning (ERP) systems or Manufacturing Execution Systems (MES). The presented solution thus helps companies to increase the automation of the entire production process.