2.2.4 Reaction representation

Reaction representation directly affects the accuracy of model predictions. In this work, a new condensed hypergraph reaction descriptor is proposed, consisting of a graph adjacency matrix and a feature matrix. The graph adjacency matrix describes the connections between nodes. It comprises the reaction mapping graph, the reaction agent graph, the molecular/reaction nodes and the reaction summary node (Fig. 1(b)). The reaction mapping graph is constructed as the union of the reactant and product graphs. Because it accounts for the atom transfers that occur during the reaction, it provides more features for model prediction than directly concatenating molecular descriptors. The reaction agent graph represents the reaction agents as a graph data structure. To account for the mutual influence between the reaction and its agents, molecular nodes and reaction nodes are connected to each other by bidirectional edges, which enables message passing between the reaction mapping graph and the agent graphs. Uni-directional edges from the molecular and reaction nodes to the reaction summary node then aggregate the reaction features from the molecular level to the whole reaction.
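
As an illustration of the connectivity described above, the following minimal Python sketch assembles a directed adjacency matrix for a toy reaction. It is not the implementation used in this work. The function name `build_condensed_graph`, the node ordering, and the role labels are hypothetical, and the directed edges from atoms to their molecular/reaction node are an assumption, since the text does not specify how atoms are attached to these nodes.

```python
def build_condensed_graph(n_rxn_atoms, rxn_bonds, agents):
    """Return (adjacency, roles) for a toy condensed reaction graph.

    n_rxn_atoms: number of atoms in the reaction mapping graph
    rxn_bonds:   (i, j) pairs, the union of reactant and product bonds
    agents:      list of (n_atoms, bonds) with molecule-local atom indices
    """
    edges = set()   # directed (source, target) pairs
    roles = []      # roles[k] is the role of node k

    def bidirectional(u, v):
        edges.add((u, v))
        edges.add((v, u))

    # 1. Reaction mapping graph: atoms joined by reactant-or-product bonds.
    roles += ["rxn_atom"] * n_rxn_atoms
    for i, j in rxn_bonds:
        bidirectional(i, j)

    # 2. Reaction agent graphs: one ordinary molecular graph per agent.
    agent_atoms = []
    for n_atoms, bonds in agents:
        offset = len(roles)
        roles += ["agent_atom"] * n_atoms
        for i, j in bonds:
            bidirectional(offset + i, offset + j)
        agent_atoms.append(range(offset, offset + n_atoms))

    # 3. Reaction node. ASSUMPTION: every mapped atom feeds it directly.
    rxn_node = len(roles)
    roles.append("reaction_node")
    for a in range(n_rxn_atoms):
        edges.add((a, rxn_node))

    # 4. Molecular nodes, linked to the reaction node in both directions
    #    so that messages can pass between reaction and agent graphs.
    mol_nodes = []
    for atoms in agent_atoms:
        mol = len(roles)
        roles.append("molecular_node")
        for a in atoms:
            edges.add((a, mol))  # ASSUMPTION: atoms feed their molecule node
        bidirectional(mol, rxn_node)
        mol_nodes.append(mol)

    # 5. Reaction summary node: uni-directional edges, aggregation only.
    summary = len(roles)
    roles.append("summary_node")
    for node in [rxn_node] + mol_nodes:
        edges.add((node, summary))

    n = len(roles)
    adjacency = [[0] * n for _ in range(n)]
    for u, v in edges:
        adjacency[u][v] = 1  # row = source, column = target
    return adjacency, roles


# Toy input: a 3-atom mapping graph and a single 2-atom agent.
adjacency, roles = build_condensed_graph(3, [(0, 1), (1, 2)], [(2, [(0, 1)])])
```

In this toy layout the adjacency matrix is symmetric only within the atom-level graphs and between the molecular and reaction nodes. The summary node has incoming edges but no outgoing ones, so it collects information without feeding it back.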
The node feature matrix is composed of atomic features, which are divided into indicator sites and change sites. Indicator sites primarily encode the atom type, while change sites combine the features from reactants and products to capture the changes in atom properties during the reaction. The edge feature matrix is composed of bond features and is constructed exclusively from change sites. Comparing the bond features of the reactants and products makes the changes in bond properties explicit, which allows the two directions of a reversible reaction to be distinguished. The types of atom features and bond features are listed in Table 1.
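
The following minimal Python sketch illustrates how change sites can distinguish the two directions of a reversible reaction. It assumes that a change site stores the reactant-side value followed by the product-side value of each property. The exact encoding is not specified here, so this pairing, the helper names `change_sites`, `atom_feature_row` and `bond_feature_row`, and the example properties (bond order, ring membership, formal charge) are hypothetical and do not reproduce the feature set of Table 1.

```python
def change_sites(reactant_props, product_props):
    """Interleave reactant and product values: [r0, p0, r1, p1, ...]."""
    sites = []
    for r, p in zip(reactant_props, product_props):
        sites.extend([r, p])
    return sites


def atom_feature_row(atomic_number, reactant_props, product_props):
    """Node features: an indicator site (atom type) plus change sites."""
    return [atomic_number] + change_sites(reactant_props, product_props)


def bond_feature_row(reactant_props, product_props):
    """Edge features: change sites only, with no indicator sites."""
    return change_sites(reactant_props, product_props)


# Hypothetical bond properties: (bond order, is_in_ring).
# Forward direction: a double bond becomes a single bond.
forward = bond_feature_row([2.0, 0.0], [1.0, 0.0])
# Reverse direction: the same bond with reactant and product swapped.
reverse = bond_feature_row([1.0, 0.0], [2.0, 0.0])

# Hypothetical atom property: formal charge, unchanged for this oxygen atom.
oxygen = atom_feature_row(8, [0.0], [0.0])
```

Under this assumed encoding, `forward` and `reverse` differ in their bond-order entries, so the forward and reverse reactions receive different edge features even though they involve the same atoms and bonds.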