INTRODUCTION
On a global scale from an electron perspective, all organisms are
electronic half-cells, powered by circuits plugged into electron sources
and sinks in the environment [1-3]. For example, in aerobic
respiration, the pathway most familiar to us because it is our own
source of energy, the oxidation of organic matter to CO2 drives a flux of
electrons and protons through metabolic pathways that reduces oxygen to
water. This, like all metabolic pathways, is a
half-cell in terms of oxidation-reduction chemistry. In the case of
aerobic respiration, the other half cell is oxygenic photosynthesis,
where sunlight is used to oxidize water and the electrons and protons
drive reduction of CO2 to organic matter. The potential
difference between the anode (e.g., organic matter; in its simplest form,
sugars) and the cathode (e.g., oxygen) provides a driving force of over 1 volt.
That is the most energy available for life on this planet, but life
existed long before there was molecular oxygen.
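
The link between a potential difference in volts and an energy can be made concrete with the standard relation ΔG = -nFΔE. The following is a minimal sketch of that conversion; the function name, the choice of two electrons, and the round 1.0 V value are illustrative assumptions rather than figures taken from this work.

```python
# Faraday constant in coulombs per mole of electrons.
FARADAY_C_PER_MOL = 96485.33212


def gibbs_free_energy_kj_per_mol(n_electrons: int, delta_e_volts: float) -> float:
    """Return the free energy change dG = -n * F * dE in kJ/mol.

    n_electrons   -- number of electrons transferred (illustrative input)
    delta_e_volts -- cathode potential minus anode potential, in volts
    """
    return -n_electrons * FARADAY_C_PER_MOL * delta_e_volts / 1000.0


# Illustrative only: a two-electron transfer across a 1.0 V difference.
dg = gibbs_free_energy_kj_per_mol(2, 1.0)
print(round(dg, 1))  # -193.0
```

A negative value indicates that electron flow from the donor to the acceptor releases energy, and a larger potential difference releases more energy per electron transferred.
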
In deep time, a set of enzymes evolved to facilitate electron transport
– the oxidoreductases or EC 1 proteins. Biological electronic circuits
require the movement of electrons over sub-nanometer distances through
an electron transfer chain that powers life. The movement of electrons
is governed by physical laws [4-6]. Oxidoreductases organize the
positions and relative energetics of chains of redox-active cofactors,
assuring the rapid, directional flow of electrons [7]. The energetic
tendency of a redox-active group to gain or lose electrons can
be experimentally measured as the redox potential, expressed in volts
(V), relative to a reference such as the standard hydrogen electrode, at
a standard pH. Redox-active groups that contribute to the redox
potential can be cofactors such as iron-sulfur clusters, hemes, or
flavins, or amino acid residues such as cysteine, methionine, or
tryptophan. The relative stability of cofactor oxidation states is
largely determined by the cofactor itself [8] but is further
modulated by the protein matrix. Electrostatic interactions, such as
proximity of positively charged basic amino acids, can stabilize a redox
cofactor in the reduced state [9, 10]. The protein can also modulate
oxidation-reduction energetics through hydrogen bonding [11, 12],
hydration [13] and dynamical features [14] of the
protein-cofactor environment. Groups of oxidoreductases form metabolic
pathways, powering cellular-scale circuits where the current depends on
the rate of catalysis and diffusion of substrates [15]. It is
critical to study how the protein environment modulates the energetics
of oxidation-reduction reactions in order to understand how electron
transfer is coupled to metabolism.
The connection between oxidoreductase structure and energetics is
central to the deep-time evolution of metabolism. Oxidoreductases must
have been among the first proteins at the origins of life over 3.5
billion years ago, providing the spark for metabolism [2, 16-20]. Because
metabolism is fundamentally electrical in nature, its evolution, and that of
the associated oxidoreductases, was strongly coupled with changes in the
redox state of the planet, which has become increasingly oxidized over
time due to both geochemical and biological processes [2, 21, 22].
Modern oxidoreductases are massive nanomachines – far too complex to
have arisen early in metabolism. Various structure-based bioinformatics
approaches have been applied to identify universal sub-folds or domains
within larger proteins that may have derived from early protein forms
[16, 17, 23-32]. In previous work focused on the evolution of
oxidoreductases, we found that modern, large enzymes were largely
derived from just a few minimal protein-cofactor building blocks [16,
17, 33]. In addition to identifying core cofactor binding folds, we
used a structure-derived criterion for electron transfer based on
cofactor-cofactor distances [7] to map a network of electron
transfer pathways between the different folds – which we refer to as
the Spatial Adjacency Network, SpAN. A notable feature of the SpAN was
the abundance of more reducing cofactor-binding folds in the network
center and more oxidizing cofactor-folds at the periphery [17]. This
suggests a time axis in the SpAN from the center to the periphery of the
network reflecting the adaptation of protein redox energetics to
emerging electron sources and sinks made available by an oxidizing
planetary environment over geologic time. Mapping quantitative estimates
of protein redox energetics onto the SpAN could allow us to
constrain the age of various protein folds based on redox information in
the geologic record [2, 34, 35].
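
The idea of linking folds by a cofactor-cofactor distance criterion can be sketched in a few lines. This is only an illustration under stated assumptions, not the procedure used to build the SpAN: the 14 Å cutoff, the use of single-point coordinates per cofactor, the fold labels, and the function name build_adjacency are all hypothetical choices made for the example.

```python
from itertools import combinations
from math import dist

# Assumed cutoff in angstroms; the text does not state the value used.
CUTOFF_ANGSTROM = 14.0

# Hypothetical cofactors: (fold label, representative xyz position in angstroms).
cofactors = [
    ("fold_A", (0.0, 0.0, 0.0)),
    ("fold_B", (10.0, 0.0, 0.0)),
    ("fold_C", (30.0, 0.0, 0.0)),
]


def build_adjacency(cofactors, cutoff=CUTOFF_ANGSTROM):
    """Return the set of fold pairs whose cofactors lie within the cutoff."""
    edges = set()
    for (fold_a, xyz_a), (fold_b, xyz_b) in combinations(cofactors, 2):
        # Only connect cofactors that belong to different folds.
        if fold_a != fold_b and dist(xyz_a, xyz_b) <= cutoff:
            edges.add(tuple(sorted((fold_a, fold_b))))
    return edges


edges = build_adjacency(cofactors)
print(sorted(edges))  # [('fold_A', 'fold_B')]
```

In this toy input only fold_A and fold_B are close enough to be joined. A real analysis would take cofactor atom coordinates from structure files and would likely measure the closest atom-to-atom separation rather than a single representative point.
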
Computational prediction of redox energetics from
protein structures is an ongoing challenge. Current methods span many
levels of theory, from quantum-mechanical to empirical [36], and include
recent advances using machine learning [37]. Site-directed
mutagenesis studies on natural oxidoreductases [38-40] and protein
engineering [41-44] have been used to test molecular hypotheses of
how the protein environment tunes redox energetics. Large datasets of
protein structures, including oxidoreductases, are on the horizon with
advances in functional annotation from genomic and metagenomic datasets
[20, 45] combined with recent advances in structure prediction
[46-48] including bound cofactors [49]. Effective models that
can predict redox energetics based on structural information will become
increasingly valuable for understanding bioenergetics, evolution of
metabolism and engineering of bioelectronic pathways [42, 50].
Motivated by the need to design and train better models and the goal of
mapping redox energetics onto the SpAN to study oxidoreductase
evolution, we developed ProtReDox, a manually curated database of protein
redox potentials. We examined literature reports of oxidoreductase
energetics and identified the cofactor type, redox potential, UniProt
and PDB (if available) identifiers, and experimental metadata such as
potentiometric measurement technique, pH and buffer conditions.
ProtReDox version one is available at https://protein-redox-potential.web.app.
We apply this dataset to explore how redox energetics is modulated by
cofactor type, protein environment, and experimental conditions and, finally,
how energetics mapped onto the SpAN inform geochemical constraints on
deep-time oxidoreductase evolution.
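
To make the kind of record described above concrete, the following sketch shows one way such an entry could be represented and summarized by cofactor type. The class name, field names, and all values are hypothetical placeholders chosen for illustration; they do not reflect the actual ProtReDox schema or any measured potentials.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional


@dataclass(frozen=True)
class RedoxRecord:
    """One literature-derived measurement (hypothetical field names)."""
    cofactor_type: str
    potential_v: float                # volts, relative to a stated reference
    ph: Optional[float] = None
    technique: Optional[str] = None
    uniprot_id: Optional[str] = None  # may be missing
    pdb_id: Optional[str] = None      # may be missing


def mean_potential_by_cofactor(records):
    """Group records by cofactor type and average their potentials."""
    grouped = {}
    for record in records:
        grouped.setdefault(record.cofactor_type, []).append(record.potential_v)
    return {cofactor: mean(values) for cofactor, values in grouped.items()}


# Placeholder values only, not real measurements.
records = [
    RedoxRecord("heme", -0.1, ph=7.0),
    RedoxRecord("heme", 0.1, ph=7.0),
    RedoxRecord("flavin", -0.2, ph=7.0),
]
summary = mean_potential_by_cofactor(records)
print(summary)
```

Keeping the identifiers optional mirrors the statement that PDB identifiers are recorded only when available, and storing pH and technique alongside each value allows measurements to be compared under like conditions.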