Abstract: We present a program for the prediction of biological activity spectra for drug-like organic substances. New lead compounds can be found on the basis of predicted biological activity spectra. In-house and Internet versions of the PASS program are discussed.
Keywords: biological activity spectrum, computer-aided
prediction, computer system PASS (Prediction of Activity Spectra for
Substances), applications in computer-aided drug discovery, prediction via
Internet.
Introduction
Most known biologically active substances have many different biological activities, which cause both the main (therapeutic) and supplementary (side) actions. Some of these activities are found during the initial preclinical study; others are, unfortunately, found too late, in clinical trials. Sometimes, many years after the first launch of a drug, additional activities are discovered that become the basis for a new therapeutic application.
Most computer-aided drug-discovery methods are used to study a single activity, or only a few activities, of a compound class [1-5]. A program that simultaneously predicts pharmacological effects, mechanisms, and specific toxicities on the basis of the 2D chemical structure is the tool of choice for getting an early indication of whether a compound could be a potential lead.
Victor Avidon proposed this idea more than 35 years ago [6, 7]. This technology was first developed and tested on new chemical compounds synthesized in the USSR [8, 9] in the framework of the national registration system of the USSR.
The program was revised several times. The theoretical analysis went
through several approaches and the accumulated experience of finding new leads
allows constant improvements [10-14].
The PASS team continuously collects and evaluates information about new pharmaceutical substances and lead compounds in order to update the PASS training set and to extend the predictive abilities of PASS to new chemical classes and novel biological activities.
Figure 1. Increase over the years in the number of compounds abstracted for the knowledge base
Figure 2. Increase over the years in the number of predictable activities
The current version of PASS predicts about 4130 pharmacological effects, mechanisms of action, and other effects (see Table 1) [15]. We provide a list of activities (the present list is not complete).
In the following we describe the methods used in PASS, how you can evaluate PASS yourself by using it on the Internet or as an evaluation version, and examples of practical applications.
Number | Area | Examples
261 | pharmacotherapeutic actions | Anxiolytic
66 | anti-infective actions | Antileishmanial
72 | actions blocking a certain process | Apoptosis antagonist
40 | actions stimulating a certain process | Apoptosis agonist
140 | actions blocking activity of a certain endogenous substance | Acetylcholine antagonist
71 | actions simulating activity of a certain endogenous substance | Acetylcholine agonist
5 | actions blocking a release of a certain endogenous substance | Cytochrome C release inhibitor
9 | actions stimulating a release of a certain endogenous substance | Acetylcholine release stimulant
9 | actions blocking an uptake of a certain endogenous substance | Adenosine uptake inhibitor
2219 | actions inhibiting a certain enzyme | 12-Lipoxygenase inhibitor
41 | actions stimulating action of a certain enzyme | ATPase stimulant
268 | actions blocking a certain receptor | 5-Hydroxytryptamine 1 antagonist
121 | actions stimulating a certain receptor | 5-Hydroxytryptamine 1 agonist
28 | actions blocking a certain channel | Chloride channel antagonist
5 | actions stimulating a certain channel | Calcium channel agonist
28 | actions blocking a certain transporter | GABA transporter 1 inhibitor
128 | substrates of a certain metabolic enzyme | CYP3A4 substrate
24 | actions inhibiting a certain metabolic enzyme | CYP3A4 inhibitor
13 | actions inducing a certain metabolic enzyme | CYP3A4 inducer
28 | actions inhibiting a certain protein | Collagen inhibitor
8 | actions inhibiting an expression of a certain transcription factor | Transcription factor Rho inhibitor
2 | actions stimulating an expression of a certain transcription factor | TP53 expression enhancer
389 | actions that cause a certain adverse/toxic effect | Carcinogen
Table 1. List of biological effects
Presentation of biological activities in PASS
Let's define biological activity as the result of a compound's interaction with a biological entity. In clinical studies the entity is the human organism. In preclinical testing it can be animals (in vivo) or experimental models (in vitro). The biological activity depends on a compound's structure, charge distribution, physico-chemical properties, and more. It also depends on the biological entity (species, sex, age, etc.) and on the mode of treatment (dose, route, etc.). Any biologically active compound reveals a wide spectrum of different effects. Some of them are useful in the treatment of diseases, but others cause various side and toxic effects. All activities caused by the compound are together considered to be the "biological activity spectrum of the substance".
If the experimental conditions cannot be defined narrowly, i.e. if the differences in species, sex, age, dose, route, etc. are neglected, the biological activity can be identified only qualitatively (“yes”/“no”, “active”/“inactive”). Thus, the "biological activity spectrum" is defined as an "intrinsic" property of a compound that depends only on its structure and physico-chemical characteristics. A qualitative presentation allows integrating information about biologically active compounds, collected from many different sources, into the general PASS training set. Any property of chemical compounds that is determined by their structural peculiarities can be used for prediction by PASS. It has been shown that the applicability of PASS is broader than the prediction of biological activities. For instance, this approach was successfully used to predict such a general property of organic molecules as drug-likeness (Anzali et al., 2001).
Chemical structure description in PASS
The 2D structure of compounds is chosen as the basis for the description of the chemical structure because this is the only information available at the early stage of research. Thus, using the structural formula as input data, one can obtain estimates of biological activity profiles even for virtual molecules, prior to their chemical synthesis and biological testing.
Many different characteristics of chemical compounds can be calculated on the basis of the 2D structure. The earliest versions of PASS (Poroikov et al., 1993; Filimonov et al., 1995; Filimonov and Poroikov, 1996) used the Substructure Superposition Fragment Notation (SSFN) codes (Avidon et al., 1982). However, SSFN, like many other structural descriptors, reflects a human abstraction of chemical structure rather than the nature of ligand-target interactions, which are the molecular mechanisms of biological activities.
The Multilevel Neighbourhoods of Atoms (MNA) descriptors (Filimonov et al., 1999) have certain advantages in comparison with SSFN. These descriptors are based on a molecular structure representation that includes the hydrogen atoms according to the valences and partial charges of the other atoms and does not specify the types of bonds. MNA descriptors are generated as a recursively defined sequence:
- the zero-level MNA descriptor for each atom is the notation A of the atom itself;
- any next-level MNA descriptor for the atom A is the sub-structure notation A(D1D2...Di...), where Di is the previous-level MNA descriptor of the i-th immediate neighbour of the atom A.
The notation A of the atom may include not only the atomic type but also any additional information about the atom. In particular, if the atom is not included in a ring, it is marked by “-”. The neighbour descriptors D1D2...Di... are arranged in a unique lexicographic order. The iterative process of MNA descriptor generation can be continued to cover the first, second, etc. neighbourhoods of each atom.
The molecular structure is
represented in PASS by the set of unique MNA descriptors of the 1st and
2nd levels (Figure 3). The substances are considered to be equivalent in
PASS if they have the same set of MNA descriptors. Since MNA descriptors
do not represent the stereochemical peculiarities of a molecule, the
substances whose structures differ only stereochemically, are formally
considered as equivalent.
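The recursive definition above can be illustrated with a minimal sketch. It is not the PASS implementation: the function names, the molecule encoding (a list of atom labels plus a list of bonded index pairs) and the use of plain string sorting as the "unique lexicographic order" are assumptions made for illustration, and the ring marker “-” is omitted.

```python
def mna_descriptors(atoms, bonds, level):
    """Return one MNA-style descriptor string per atom at the given level.

    atoms: list of atom labels, hydrogens included (hypothetical encoding).
    bonds: list of (i, j) index pairs; bond types are ignored, as in the text.
    """
    neighbours = [[] for _ in atoms]
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    current = list(atoms)  # zero level: the notation of the atom itself
    for _ in range(level):
        # next level: A(D1D2...Di...), with the neighbours' previous-level
        # descriptors in a fixed (here: plain string-sorted) order
        current = [
            atoms[i] + "(" + "".join(sorted(current[j] for j in neighbours[i])) + ")"
            for i in range(len(atoms))
        ]
    return current


def mna_set(atoms, bonds):
    """Set of unique 1st- and 2nd-level descriptors used to compare structures."""
    return set(mna_descriptors(atoms, bonds, 1)) | set(mna_descriptors(atoms, bonds, 2))


# Water as a toy example: O bonded to two H atoms.
water_atoms = ["O", "H", "H"]
water_bonds = [(0, 1), (0, 2)]
print(sorted(mna_set(water_atoms, water_bonds)))
```

In this sketch two structures count as equivalent when `mna_set` returns the same set for both, which mirrors the equivalence rule stated above; the atom numbering of the input does not affect the result.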
HC |
C(C(CC—H)C(CC—C)—H(C)) |
HO |
C(C(CC—H)C(CN—H)—H(C)) |
CHCC |
C(C(CC—H)C(CN—H)—C(C—O—O)) |
CHCN |
C(C(CC—H)N(CC)—H(C)) |
CCCC |
C(C(CC—C)N(CC)—H(C)) |
CCOO |
N(C(CN—H)C(CN—H)) |
NCC |
—H(C(CC—H)) |
OHC |
—H(C(CN—H)) |
OC |
—H(—O(—H—C)) |
|
—C(C(CC—C)—O(—H—C)—O(—C)) |
|
—O(—H(—O)—C(C—O—O)) |
|
—O(—C(C—O—O)) |
Figure 3. Structural
formula of nicotinic acid and its MNA descriptors of the 1st (left
column) and 2nd (right column) levels
New QNA (Quantitative Neighbourhoods of Atoms) descriptors were recently developed, which allow the analysis of quantitative structure-activity relationships (Filimonov et al., 2009).
Mathematical Approach
The PASS algorithm for biological activity spectrum prediction is based on Bayesian estimates of the probabilities that a molecule belongs to the classes of active and inactive compounds, respectively. The mathematical method is described in several publications (Lagunin et al., 2000; Stepanchikova et al., 2003; Poroikov and Filimonov, 2005; Filimonov and Poroikov, 2006; Filimonov and Poroikov, 2008), and its details will not be discussed here.
Since the main purpose of PASS is the prediction of activity spectra for new molecules, the general principle of the PASS algorithm is to exclude from the SAR Base (knowledge base) the substances that are equivalent to the substance under prediction.
The structure for which the PASS prediction should be carried out is presented as a molfile (for a set of molecules, as an SDFile). The predicted activity spectrum is presented in PASS as a list of activities with the probabilities "to be active" Pa and "to be inactive" Pi calculated for each activity (Figure 6). The list is arranged in descending order of Pa-Pi; thus, the more probable activities appear at the top of the list. Only activities with Pa>Pi are considered possible for a particular compound. The list can be shortened at any desired cutoff value, but Pa>Pi is used by default. If the user chooses a rather high value of Pa as the cutoff for selecting probable activities, the chance of confirming the predicted activities by experiment is high too, but many existing activities will be lost. For instance, if Pa>90% is used as a cutoff, about 90% of real activities will be lost; for Pa>80%, the portion of lost activities is 80%, etc.
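The selection and ordering rule described above can be sketched in a few lines. This is an illustration only, not PASS code: the function name and the tuple layout are assumptions, the first four rows are taken from the Cavinton spectrum in Table 2, and the last row is a made-up entry added to show that activities with Pa<=Pi are dropped.

```python
def select_activities(spectrum, pa_cutoff=None):
    """Keep activities with Pa > Pi (and Pa > pa_cutoff, if given),
    sorted in descending order of Pa - Pi.

    spectrum: list of (activity name, Pa, Pi) tuples (hypothetical layout).
    """
    kept = [
        (name, pa, pi)
        for name, pa, pi in spectrum
        if pa > pi and (pa_cutoff is None or pa > pa_cutoff)
    ]
    return sorted(kept, key=lambda row: row[1] - row[2], reverse=True)


spectrum = [
    ("Vasodilator", 0.855, 0.005),
    ("Insulin promoter", 0.417, 0.238),
    ("Peripheral vasodilator", 0.929, 0.004),
    ("Multiple sclerosis treatment", 0.900, 0.000),
    ("Hypothetical activity", 0.100, 0.300),  # made-up row, Pa < Pi
]

for name, pa, pi in select_activities(spectrum):
    print(f"{pa:.3f} {pi:.3f} {name}")
```

Passing `pa_cutoff=0.7` shortens the list to the activities most likely to be confirmed, at the price described in the text: the activities predicted with lower Pa are no longer shown.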
It is necessary to keep in mind that the probability Pa reflects the similarity of the molecule under prediction to the structures of the molecules that are most typical in the sub-set of “actives” in the training set. Therefore, there is usually no direct correlation between the Pa values and quantitative characteristics of activities.
Even an active and potent compound whose structure does not resemble the typical structures of “actives” from the training set may obtain a low Pa value during the prediction (even negative Pa-Pi values can be observed). This is explained by the way the estimates are constructed: the values of Pa for “actives” and of Pi for “inactives” are distributed uniformly.
Taking this into account, the following interpretation of prediction results is possible. If, for instance, Pa=0.9, then for 90% of “actives” from the training set the appropriate estimates are less than for this compound, and only for 10% of “actives” are these values higher. If one declines the suggestion that this compound is active, one will make a wrong decision with probability 0.9.
If Pa > 0.7, the chance of finding the activity experimentally is high, but in many cases the compound may turn out to be a close analogue of known pharmaceutical agents.
If 0.5 < Pa < 0.7, the chance of finding the activity experimentally is lower, but the compound is probably not so similar to known pharmaceutical agents.
If Pa < 0.5, more than half of the “actives” from the training set have higher estimates for this activity than this compound. If one declines the suggestion that this compound is active, one will make a wrong decision with probability less than 0.5. In this case the probability of confirming this kind of activity in experiment is small, but if it is confirmed, the chance is more than 50% that this structure has NOT been reported with this activity and might be a valuable lead compound.
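The three Pa ranges above can be written as a small helper. This is only a sketch: the function name and the returned labels are assumptions, and because the text leaves the boundary values 0.5 and 0.7 unassigned, placing them in the lower range here is an arbitrary choice.

```python
def interpret_pa(pa):
    """Map a Pa value to the qualitative reading given in the text.

    Boundary handling (exactly 0.5 or 0.7) is an assumption of this sketch.
    """
    if not 0.0 <= pa <= 1.0:
        raise ValueError("Pa is a probability and must lie in [0, 1]")
    if pa > 0.7:
        return "high chance of confirmation; possibly a close analogue of known agents"
    if pa > 0.5:
        return "lower chance of confirmation; probably less similar to known agents"
    return "small chance of confirmation; if confirmed, possibly a novel lead"


for pa in (0.855, 0.656, 0.225):  # Pa values taken from the Cavinton example
    print(pa, "->", interpret_pa(pa))
```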
If the predicted biological
activity spectrum is wide, the structure of the compound is quite
simple, and does not contain peculiarities, which are responsible for
the selectivity of its biological action.
If it appears that the
structure under prediction contains several new MNA descriptors (in
comparison with the descriptors from the compounds of the training set),
then the structure has low similarity with any structure from the
training set, and the results of prediction should be considered as
rather rough estimates.
Based on these criteria, one
may choose which activities have to be tested for the studied compounds
on the basis of compromise between the novelty of expected
pharmacological action and the risk to obtain the negative result in
experimental testing. Certainly, one could also take into account a
particular interest to some kinds of activity, experimental facilities,
etc.
We have developed a dedicated application, CWM Lead Finder, which uses clustering algorithms to match the biological activity spectra of a set of compounds with known biological activity against those of a set of untested compounds.
Mathematical Method
The accuracy and efficiency of more than 200 different mathematical approaches were tested to select the most relevant algorithms [16]. One of the methods that provides a satisfactory quality of prediction is described below in more detail.
Definitions:
n is the total number of compounds in the training set;
ni is the number of compounds, that have the descriptor i;
nj is the number of compounds, that reveal the activity j;
nij is the number of compounds, that have both the descriptor i and the activity
j;
pj = nj/n is the estimate of a priori probability of activity
j;
pij = nij/ni is the estimate of the conditional probability of the activity j
for the descriptor i;
m is the number of descriptors for the compound under prediction;
ri = ni/(ni + 0.5/m) is a regulating factor;
Prj is the initial estimate of the probability of the
activity j for the compound under prediction;
CPj is the cutting point;
E1j(CPj) is the estimate of 1st kind error probability;
E2j(CPj) is the estimate of 2nd kind error probability;
A 1st kind error is observed when the compound under prediction is actually active but Prj < CPj;
A 2nd kind error is observed when the compound under prediction is actually inactive but Prj > CPj.
LOO is the leave-one-out procedure. For each compound in the training set, the values n and ni (for the descriptors i of that compound) are changed to n-1 and ni-1, the values nj and nij are also changed to nj-1 and nij-1 when the compound has activity j, and the estimates Prj are calculated.
MEP is the maximal error of prediction (see below).
Algorithm of Prediction
Structural descriptors are generated for the compound under prediction.
The following values are calculated for each activity:
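\[ u_j = \sum_i \arcsin\{ r_i (2 p_{ij} - 1) \}, \qquad v_j = \sum_i \arcsin\{ r_i (2 p_j - 1) \} \]
\[ s_j = \sin(u_j / m), \qquad t_j = \sin(v_j / m) \]
\[ Pr_j = \frac{1}{2} \left( 1 + \frac{s_j - t_j}{1 - s_j t_j} \right) \]
Here the sums run over the m descriptors i of the compound under prediction.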
Validation criteria: The LOO estimates of Prj are calculated for each compound in the training set. The estimates of E1j(CPj) and E2j(CPj) are calculated for each activity. The cross point CPj*, at which
E1j(CPj*) = E2j(CPj*),
is calculated. The maximal error of prediction MEP is:
MEPj = E1j(CPj*) = E2j(CPj*)
Results of the prediction:
The probability to be active is:
Pa = E1j(Prj)
The probability to be inactive is:
Pi = E2j(Prj)
The result for the prediction is presented as the list of
activities with appropriate Pa and Pi, sorted in descending order of the
difference (Pa-Pi)>0.
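The formulas of this section can be followed step by step in a short sketch. It uses the symbols defined above (n, ni, nj, nij, m, ri, Prj) but is not the PASS source code. The function names and input layout are hypothetical, skipping descriptors with ni = 0 is an assumption, and E1j and E2j are estimated here as simple empirical fractions of the LOO estimates of known actives and inactives, which is one plausible reading of the definitions.

```python
import math


def initial_estimate(n, n_j, descriptor_counts):
    """Initial estimate Pr_j for one activity j.

    n: number of compounds in the training set
    n_j: number of compounds with activity j
    descriptor_counts: one (n_i, n_ij) pair per descriptor of the query compound
    """
    m = len(descriptor_counts)
    if m == 0:
        raise ValueError("the compound must have at least one descriptor")
    p_j = n_j / n
    u = v = 0.0
    for n_i, n_ij in descriptor_counts:
        if n_i == 0:
            continue  # assumption: a descriptor unseen in training adds nothing
        r_i = n_i / (n_i + 0.5 / m)   # regulating factor
        p_ij = n_ij / n_i             # conditional probability estimate
        u += math.asin(r_i * (2.0 * p_ij - 1.0))
        v += math.asin(r_i * (2.0 * p_j - 1.0))
    s, t = math.sin(u / m), math.sin(v / m)
    return (1.0 + (s - t) / (1.0 - s * t)) / 2.0


def pa_pi(pr, loo_actives, loo_inactives):
    """Pa = E1j(Pr_j) and Pi = E2j(Pr_j), estimated as empirical fractions.

    loo_actives / loo_inactives: LOO estimates Pr_j of the training compounds
    that do / do not have activity j (hypothetical inputs).
    """
    pa = sum(x < pr for x in loo_actives) / len(loo_actives)
    pi = sum(x > pr for x in loo_inactives) / len(loo_inactives)
    return pa, pi


# Made-up counts: both descriptors are enriched among the actives.
pr = initial_estimate(n=100, n_j=20, descriptor_counts=[(10, 8), (50, 30)])
print(round(pr, 3), pa_pi(pr, [0.2, 0.4, 0.6, 0.8], [0.1, 0.3, 0.5, 0.7]))
```

When every pij equals the prior pj, the two sums coincide and Prj comes out as 0.5; descriptors that are enriched among the actives push Prj above 0.5, and depleted ones push it below.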
Process of PASS Development
Figure 4. The process of PASS development
The Training Set
The current PASS training set consists of about 270,000 biologically active compounds, comprising already launched drugs and drug candidates under clinical or advanced preclinical testing. This training set has been compiled since 1972 from many sources, including publications, patents, databases, private communications, etc. For the majority of compounds included in the training set, the biological activity spectrum was studied in detail.
In PASS Pro customers can easily create their own training sets. A training set consists of an SDFile with the field activity_prediction. This file is read into PASS. It takes about 5 minutes to read a training set of 1000 compounds.
Validation of PASS
The quality of prediction can be estimated by leave-one-out cross validation (LOO CV). Each compound is removed in turn from the training set, and its activity spectrum is predicted on the basis of the remaining part of the training set. The result is compared with the known activities of the compound, and the maximal error of prediction (MEP) is calculated and averaged over all compounds and activities.
The average accuracy of prediction is about 95.3% according to the LOO CV estimate, while for individual kinds of activity the prediction accuracy varies from 70.7% (Antineoplastic, Myeloid leukemia) to 99.9% (p21-activated kinase 1 inhibitor).
Accuracy of PASS Prediction
The accuracy of PASS
predictions depends on several factors, from which the quality of the
training set seems to be the most important one. A perfect training set
should include the comprehensive information about biological activities
known or possible for each compound. In other words, the whole
biological activity spectrum should be thoroughly investigated for each
compound included into the PASS training set. Unfortunately, no database
exists with information about biologically active compounds tested
against each kind of biological activity. Therefore, the information
concerning known biological activities for any compound is always
incomplete.
|
We investigated the influence of the
information’s incompleteness on the prediction accuracy for new compounds. About
20000 “principal compounds” from MDDR database (SYMYX MDL) were used to create
the heterogeneous training and evaluation sets. At random 20, 40, 60, 80% of
information were excluded from the training set. Either structural data or
biological activity data were removed in two separate computer experiments. In
both cases it was shown that even if up to 60% of information is excluded, the
results of prediction are still satisfactory (Poroikov et al., 2000). Thus,
despite the incompleteness of information in the training set, PASS algorithm is
robust enough to get the reasonable results of prediction.
PASS predictions were performed for
about 250000 molecules from Open NCI database (Poroikov et al., 2003). This
information is presented at the NCI web-site (http://cactus.nci.nih.gov/ncidb2/)
in a searchable mode. One could combine different terms in a query using Boolean
operators. For example, with a query “Angiogenesis inhibitor AND Pa>0.9 AND
Pi<0.2 NOT acid NOT amide” we identified 85 hits. Seven compounds were tested in
NCI and four showed the Angiogenesis inhibitory activity at the approximately
10-100 µM level (Poroikov et al., 2003). Also, on the basis of results of
anti-HIV testing of compounds from the Open NCI database, we estimated that
using PASS predictions one could significantly (up to 17 times) increase the
fraction of “actives” in the selected sub-set (Poroikov et al., 2003).
PASS on the Internet
PASS INet service
(http://www.ibmc.msk.ru/PASS)
provides the possibility for any registered user to obtain PASS
predictions free-of-charge (Lagunin et al., 2000; Sadym et al., 2003;
Filimonov and Poroikov, 2006; Geronikaki et al., 2008a). The user
obtains the PASS predictions by submitting a molfile or drawing the
structure with a Marvin applet.
By January 1st, 2010 the
number of registered users exceeded 5000, and over 115000 predictions
were obtained. Based on the prediction results, the researchers select
the most prospective substances for chemical synthesis and biological
testing. Comparison of PASS prediction results from different chemical
series with various kinds of biological activity provides independent
validation. Currently, about thirty independent papers have been
published, where the coincidence of PASS predictions with the experiment
is described. For example, due to the PASS predictions, new
antileishmanial agents were found among the 2 substitution-bearing
6-nitro- and 6-amino-benzothiazoles (Delmas et al., 2002), 7-substituted
9-chloro and 9-amino-2-methoxyacridines (Di Giorgio et al., 2003), beta-carboline
alkaloids (Di Giorgio et al., 2004); new anxiolytics were found among
quinazolines (Goel et al., 2005), thiazoles, pyrazoles, isatins, a-fused
imidazoles and other chemical series (Geronikaki et al., 2004); new
anti-inflammatory agents were found among substituted amides and
hydrazides of dicarboxylic acids (Dolzhenko et al., 2003),
1-acylaminoalkyl-3,4-dialkoxybenzene derivatives (Labanauskas et al.,
2005); etc. (for review – see Geronikaki et al., 2008a).
|
Also, on the basis of PASS predictions new antihypertensive
and antiinflammatory agents with dual mechanisms of actions were discovered (Lagunin
et al., 2003; Geronikaki et al., 2008b), which demonstrated the capability of
PASS in finding multitargeted agents exhibiting additive/synergistic effects.
PASS applications for predicting biological activity spectra of organic
molecules including known drug substances are described in detail (Poroikov et
al., 2001; Poroikov and Filimonov, 2002; Poroikov et al., 2007).
PASS INet, however, does not provide the full functionality of the commercial version of PASS. In particular, PASS INet uses an earlier version of the SAR Base, predicts a smaller number of biological activities, and accepts only a single molecule at a time, supplied as a molfile. In the commercial version of PASS (Figure 5) SDFiles are used, and the prediction results can be analyzed further with PharmaExpert.
We also provide continuous support for commercial licenses, answering questions and supplying the latest versions of PASS when they appear.
Figure 5.
PASS user interface and example of prediction results (displayed in a
graphic mode)
|
In the commercial version of PASS the user can evaluate the contribution of each atom in a molecule to the required biological activity (Figure 6). The color of each atom depends on the contribution of the atom to the activity:
Green: Pa = 1, Pi = 0
Red: Pa = 0, Pi = 1
Blue: Pa = 0, Pi = 0
Grey: Pa = 0.33, Pi = 0.33
Thus, Green means a positive impact of a particular fragment on the activity; Red means a negative impact of a particular fragment on the activity; Blue and Grey mean a neutral impact of a particular fragment on the activity. Based on this information, a medicinal chemist could modify the structure in order to increase the probability of the desired pharmacological activity or decrease the probability of toxic action.
Figure 6. Influence of particular atoms in a molecule on a particular activity (antihypertensive in this example).
PharmaExpert
PharmaExpert is a tool for the analysis of PASS predictions. PharmaExpert (Poroikov et al., 2005; PharmaExpert Program Package, 2006) was developed to analyze the biological activity spectra of substances predicted by the PASS program. This software provides a flexible mechanism for selecting compounds with the required biological activity profiles. The different kinds of biological activity are divided into six classes: mechanisms of action, pharmacological effects, toxic/adverse effects, metabolic terms, transporter terms and gene expression terms.
PharmaExpert analyzes the “mechanism-effect(s)” and “effect-mechanism(s)” relationships, identifies the probable drug-drug interactions for pairs of molecules, and searches for molecules with the required activity profile(s) and/or acting on multiple targets (Figure 7). The analysis is based on a knowledge base of “mechanism-effect(s)” relationships that has been collected from the literature for more than 12 years and currently includes about 8000 relationships.
PharmaExpert also generates reports allowing users to prepare
automatically the analysis of biological activity profiles for a set of
compounds.
Figure 7.
Example of PharmaExpert search for antineoplastic multitargeted ligands |
Revealing New Effects and Mechanisms of Action
This application is illustrated below by the prediction of the biological activity spectrum for the well-known cerebrotonic drug Cavinton (Vinpocetin), which was launched by Gedeon Richter (Hungary) more than twenty years ago. Its structure and predicted biological activity spectrum are given below.
Cavinton has been used in medical practice for twenty years. The many activities found in preclinical testing and clinical trials during this period are compared with the result of the prediction. According to the available literature, only 16 of the 47 predicted activities of Cavinton have already been found. These activities are marked by "+" in Table 2.
In particular, PASS predicts vasodilator and spasmolytic activities (Pa=0.855 and 0.540, respectively), which correspond to the well-known pharmacological effects of Cavinton: it causes vasodilatation and increases brain blood flow and metabolism. Antihypoxic and Antiischemic effects are also predicted for Cavinton (Pa=0.700 and 0.656, respectively), and Cavinton is used for these purposes. Cavinton is also predicted to be a Lipid peroxidase inhibitor (Pa=0.650), an agent for cognition disorders treatment (0.648), an agent for acute neurological disorders treatment (0.577), etc. Cavinton has all these activities.
The predicted biological activity spectrum of Cavinton suggests several new applications of the substance. Among them are: Multiple sclerosis treatment (Pa=0.900); Antineoplastic enhancer (0.812), Antineoplastic Alkaloid (0.225) and Antitumor-Cytostatic (0.236); Antiparkinsonian rigidity-relieving (0.271) and Antiparkinsonian tremor-relieving (0.243); etc. While Multiple sclerosis treatment and Antineoplastic enhancer are predicted with high probability, the other additionally predicted activities have relatively small values of Pa.
Similarly, the predicted activity spectrum for any compound provides ideas for further testing. As a result, some new effects and mechanisms may be found for old substances. By varying the cutoff value of Pa, one may choose the desired balance between novelty and the acceptable risk of a negative result.
No | Pa | Pi | Activity | Experiment | Reference
1 | 0.929 | 0.004 | Peripheral vasodilator | |
2 | 0.900 | 0.000 | Multiple sclerosis treatment | |
3 | 0.855 | 0.005 | Vasodilator | + | [17, 18]
4 | 0.844 | 0.003 | Abortion inducer | + | [17]
5 | 0.812 | 0.001 | Antineoplastic enhancer | |
6 | 0.760 | 0.006 | Coronary vasodilator | + | [19]
7 | 0.732 | 0.007 | Spasmogenic | |
8 | 0.700 | 0.036 | Antihypoxic | + | [17, 20, 21]
9 | 0.650 | 0.004 | Lipid peroxidase inhibitor | + | [22, 23]
10 | 0.648 | 0.008 | Cognition disorders treatment | + | [17, 24, 25]
11 | 0.656 | 0.021 | Antiischemic | + | [17, 26-28]
12 | 0.577 | 0.013 | Acute neurologic disorders treatment | + | [17, 18]
13 | 0.540 | 0.039 | Spasmolytic | + | [18]
14 | 0.519 | 0.026 | Antianginal agent | |
15 | 0.486 | 0.037 | Antihypertensive | + | [18]
16 | 0.449 | 0.035 | Antiarrhythmic | + | [29]
17 | 0.432 | 0.063 | Sympatholytic | |
18 | 0.438 | 0.077 | Sedative | + | [18]
19 | 0.500 | 0.152 | Antiinflammatory, Pancreatic | |
20 | 0.328 | 0.020 | Antidepressant, Imipramin-like | |
21 | 0.300 | 0.010 | Thrombolytic | + | [17, 18, 20]
22 | 0.342 | 0.075 | Psychotropic | + | [18]
23 | 0.276 | 0.023 | Alpha 2 adrenoreceptor antagonist | + | [30]
24 | 0.273 | 0.029 | Anesthetic intravenous | |
25 | 0.547 | 0.304 | Vascular (peripheral) disease treatment | |
26 | 0.225 | 0.006 | Antineoplastic Alkaloid | |
27 | 0.291 | 0.086 | Cholinergic antagonist | |
28 | 0.263 | 0.066 | Benzodiazepine agonist partial | |
29 | 0.417 | 0.238 | Insulin promoter | |
Table 2. Predicted biological activity spectrum for Cavinton
Determining Relevant Screens for a Particular Compound
Testing can be organized in descending order of the difference (Pa-Pi) for the different activities. For example, Cavinton should be studied in the following tests: Peripheral vasodilator (0.929-0.004), Multiple sclerosis treatment (0.900-0.000), Vasodilator (0.855-0.005), Abortion inducer (0.844-0.003), Antineoplastic enhancer (0.812-0.001), Coronary vasodilator (0.760-0.006), etc.
In this case both the safety and the efficacy of a new compound will be characterized more comprehensively. Moreover, it has been shown that the economic viability of such an approach to testing is more than 500% [32].
Selecting the Most Prospective Compounds for High-Throughput Screening
Sometimes one is interested in activities that are not yet included in PASS, and the data are not available to train one's own knowledge base for PASS Pro. In such cases two other strategies are suitable.
The first strategy is based on the hypothesis that the more activities are predicted for a compound, the higher is the chance of finding some useful pharmacological action for this compound. For each compound the following value is calculated:
\[ P = \frac{1}{n} \sum \frac{Pa}{Pa + Pi} \]
where the sum runs over the n biological activities under consideration.
All compounds are arranged in descending order of their P values, and only the compounds with the highest values of P are selected for screening.
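As an illustration of the first strategy, the score P and the resulting ranking could be computed as follows. The function names, the data layout and the example values are assumptions made for this sketch, and treating an activity with Pa + Pi = 0 as contributing nothing is also a choice of the sketch, not something the text specifies.

```python
def screening_score(spectrum):
    """P = (1/n) * sum(Pa / (Pa + Pi)) over the n activities considered.

    spectrum: list of (Pa, Pi) pairs for one compound (hypothetical layout).
    """
    n = len(spectrum)
    if n == 0:
        raise ValueError("at least one activity is required")
    total = 0.0
    for pa, pi in spectrum:
        if pa + pi > 0:  # assumption: skip undefined 0/0 terms
            total += pa / (pa + pi)
    return total / n


def rank_compounds(spectra):
    """Return compound names in descending order of P."""
    return sorted(spectra, key=lambda name: screening_score(spectra[name]), reverse=True)


# Made-up spectra for three compounds, two activities each.
spectra = {
    "compound A": [(0.9, 0.1), (0.5, 0.5)],
    "compound B": [(0.2, 0.6), (0.3, 0.3)],
    "compound C": [(0.8, 0.0), (0.7, 0.1)],
}
print(rank_compounds(spectra))
```

Selecting the top of this list for screening corresponds to keeping only the compounds with the highest values of P.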
The second strategy is based on the hypothesis that the more "novel" a compound is, the higher is the probability of finding an NCE. Thus, the compounds with the highest number of new descriptors are selected.
Both strategies were tested on datasets of 10,000 - 70,000 compounds, and their efficacy was shown [31].
Another approach is to use CWM Lead Finder.
Experimental Verification
The predictions of PASS have been confirmed by experiment. Some examples are given below.
The activity spectra were predicted for 300 new chemical compounds synthesized in the Chemical-Pharmaceutical Research Institute (Novokuznetsk). Twenty compounds were selected for testing as probable antiulcer agents. Nine compounds were synthesized and tested. Potent antiulcer activity was found for 5 of these compounds. These new antiulcer agents are NCEs [33]. The economic advantage is about (300/20) × 100 = 1500% in this study.
The activity spectra were predicted for 520 new chemical compounds synthesized in the Institute of Organic Chemistry of the Russian Academy of Science (Moscow). Fourteen compounds were selected for testing as the most prospective. The results of 22 experiments on 5 different kinds of activity coincided with the predictions in 20 cases, so the accuracy of prediction is about 90%.
Based on the predicted biological activity spectra for about 20 macroheterocyclic compounds, 2 antitumor leads were found [34].
New antibacterial agents were found based on the biological activity spectra for derivatives of 1-amino-4-(5-aryloxazolyl-2)-butadienes-1,3 [35].
Analgesic, antiinflammatory, antioxidant and some additional activities were predicted and confirmed by experiment for some thiazole derivatives [36].
Benefits of PASS
In silico screening in the early stages of research.
Only a 2D structure is required as input for PASS.
Reasonable accuracy of prediction. The average accuracy of prediction
in leave-one-out cross-validation (for ~205,000 compounds and ~3750 kinds of
biological activity from the PASS training set) is about 95%. The PASS algorithm
produces rather robust estimates of structure-activity relationships despite the
incompleteness of the training set (Poroikov et al., 2000).
PASS parameters represent the biological space. PASS
represents the properties of molecules in biological space, in contrast to many
other descriptors, which reflect the structural properties of molecules. PASS
parameters can therefore be used to cluster compounds
according to their biological properties rather than their structural
similarity.
Predictions are fast.
Calculating biological activity spectra for 10,000 compounds on an ordinary
PC takes about 5 min; PASS can therefore be used effectively to analyze
databases consisting of millions of structures.
A standard structure format is used. Standard
SDFile or molfile formats (MDL/Symyx) are used
as input for PASS.
Only an ordinary PC is necessary.
PASS and PharmaExpert run on a personal computer under
Windows NT/XP/Vista/7.
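To give a feel for the input format and the throughput figure quoted above, the sketch below counts the records in SDFile text and estimates the run time. It is an illustration under stated assumptions and is not part of PASS: the function names are invented here, and the estimate assumes that the quoted rate of about 5 min per 10,000 compounds scales linearly with database size. The only format detail relied on is that each SDFile record ends with a line containing `$$$$`.

```python
def count_sdf_records(sdf_text):
    """Count records in SDFile text; each record ends with a '$$$$' line."""
    return sum(1 for line in sdf_text.splitlines() if line.strip() == "$$$$")


def estimated_minutes(n_compounds, per_10k_minutes=5.0):
    """Scale the quoted rate (about 5 min per 10,000 compounds) linearly."""
    return n_compounds / 10_000 * per_10k_minutes


# Two placeholder records; real records hold a molfile block before '$$$$'.
sample = "mol 1 placeholder\n$$$$\nmol 2 placeholder\n$$$$\n"

print(count_sdf_records(sample))                     # 2
print(estimated_minutes(1_000_000))                  # 500.0
print(round(estimated_minutes(1_000_000) / 60, 1))   # 8.3
```

Under the linear-scaling assumption, a database of one million structures would take on the order of eight hours on the kind of PC described in the text.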
Limitations
Naturally, the PASS approach has some limitations. They are:
- The PASS approach can be applied only to so-called
"drug-like" substances.
- PASS can be applied only to activities for which
the training set includes at least 5 active compounds.
- The accuracy of PASS predictions is significantly
higher than a random guess. However, PASS cannot predict the
activity spectrum for essentially new compounds that
have no descriptors in the training set.
- In some cases PASS predicts both agonist and
antagonist (stimulator and blocker) actions simultaneously. Only
experiments can then clarify the intrinsic activity of the compound, but it probably
has affinity for the corresponding receptor (enzyme).
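Two of these limitations lend themselves to simple automated checks. The sketch below is illustrative only: the data layouts, the activity and target names, and the function names are assumptions made for this example and do not reflect the PASS output format. The threshold of 5 active compounds is taken from the text.

```python
MIN_ACTIVES = 5  # minimum number of active compounds per activity (from the text)


def predictable_activities(actives_per_activity):
    """Keep activities whose training set holds at least MIN_ACTIVES actives."""
    return sorted(a for a, n in actives_per_activity.items() if n >= MIN_ACTIVES)


def conflicting_targets(predicted):
    """Return targets predicted with both agonist and antagonist action.

    `predicted` is an iterable of (target, action) pairs; this layout is an
    assumption for illustration, not the PASS output format.
    """
    actions = {}
    for target, action in predicted:
        actions.setdefault(target, set()).add(action)
    return sorted(t for t, acts in actions.items()
                  if {"agonist", "antagonist"} <= acts)


# Made-up counts of active compounds per activity.
counts = {"antiulcer": 120, "rare activity": 3, "analgesic": 5}
print(predictable_activities(counts))  # ['analgesic', 'antiulcer']

# Made-up predictions; the first target should be flagged for experiment.
spectrum = [("dopamine D2", "agonist"), ("dopamine D2", "antagonist"),
            ("5-HT1A", "agonist")]
print(conflicting_targets(spectrum))   # ['dopamine D2']
```

A flagged target is not an error in this reading. It marks a compound whose probable affinity for the receptor or enzyme still needs an experiment to settle the direction of the effect.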
Acknowledgments
We gratefully acknowledge MDL
Information Systems, Inc. for providing ISIS/Host, ISIS/Base and the MDDR
database used in this study.
This is an edited version of the original
paper of Prof. Vladimir Poroikov, A. Kos 3.2.03, revised May 30, 2010.
References by numbers
[1] Wermuth C.G., ed., Medicinal chemistry
in practice, Academic Press, London, 1996, 968 pp.
[2] Van de Waterbeemd H., ed.,
Structure-property correlations in drug research, Landes, Austin, 1996, 210
pp.
[3] Dean P.M., Molecular similarity in drug
design, Blackie Academic, London, 1995.
[4] Livingstone D., Data analysis for
chemists. Applications to QSAR and Chemical Product Design, Oxford Science
Publ., Oxford, 1995, 239 pp.
[5] Kubinyi H., ed., 3D QSAR in drug
design, Escom, Leiden, 1993, 759 pp.
[6] Avidon V., Criteria for similarity
assessment of chemical structures and the basics of informational language
for development of informational-logical system on biologically active
compounds. Chem. & Pharmaceut. J. (Rus.), 1974, 8 (8), 22-25.
[7] Piruzyan L.A., Avidon V.V., Rozenblit
A.B., et al. Statistical analysis of the information file on biologically
active compounds. I. Data base on the structure and activity of biologically
active compounds. Chem. & Pharmaceut. J. (Rus.), 1977, 11 (4), 35-40.
[8] Piruzyan L.A., Rudzit E.A. The
methodical approaches to study biological activity of chemical compounds.
Chem. & Pharmaceut. J. (Rus.), 1976, 10 (8), 21-27.
[9] Burov Yu.V., Korolchenko L.V., Poroikov
V.V. National system for registration and biological testing of chemical
compounds: facilities for new drugs' search. Bull. Natl. Center for
Biologically Active Compounds (Rus.), 1990, No. 1, 4-25.
[10] Filimonov D.A., Poroikov V.V.,
Karaicheva E.I., et al. (1995). Computer-aided prediction of biological
activity spectra of chemical substances on the basis of their structural
formulae: computerized system PASS. Experimental and Clinical Pharmacology (Rus),
58 (2), 56-62.
[11] Filimonov D.A., Poroikov V.V. PASS:
Computerized prediction of biological activity spectra for chemical
substances. Bioactive Compound Design: Possibilities for Industrial Use,
BIOS Scientific Publishers, Oxford, 1996, p.47-56.
[12] Poroikov V.V., Filimonov D.A.
Computerized prediction of biological activity spectra for chemical
substance - new approach to effective drug design. In: QSAR and Molecular
Modelling Concepts, Computational Tools and Biological Applications.
Barcelona: Prous Science Publishers, 1996, p.49-50.
[13] Poroikov V.V., Filimonov D.A.,
Stepanchikova A.V., et al. Optimization of synthesis and pharmacological
testing of new compounds based on computerized prediction of their
biological activity spectra. Chem. & Pharmaceut. J. (Rus), 1996, 30 (9),
20-23. (English translation by Consultants Bureau, New York: Pharmaceutical
Chemistry Journal, 1996, 30 (9), 570-573).
[14] Poroikov V.V. PASS, a program for the
prediction of activity spectra from molecular structure. Newsletter of The
QSAR and Modelling Society, 1997, No. 8, 12-15.
[15] Gloriozova T.A., Filimonov D.A.,
Lagunin A.A., Poroikov V.V. Testing of computer system for prediction of
biological activity spectra PASS on the set of new chemical compounds. Chem.
& Pharmaceut. J. (Rus), 1996, In press.
[16] Filimonov D.A. Comparison of
Algorithms for Computer Prediction of Biological Activity Spectra for
Chemical Compounds on the Basis of Their Structural Formulae. II Rus. Natl.
Congress "Man and Drugs", Moscow, Abstracts, 1995, 62-63.
[17] Summary of Cavinton (Vinpocetine)
Gedeon Richter, Budapest-Hungary, 1994-06-07.
[18] Mashkovskii M.D. The Pharmaceuticals,
Medicine, Moscow, 1997, v.1, 399-400.
[19] VIDAL. Pharmaceuticals in Russia.
Moscow, AstraPharmService, 1997.
[20] Kiss B., Karpati E. Acta Pharm.
Hung., 1996, 66 (5), 213-224.
[21] Plotnikova T.M., Plotnikov M.V.,
Bazhenova T.G. Bull. Exp. Biol. Med., 1991, 111 (2), 170-172.
[22] Karmazsin L., Olah V. A., Balla G.,
Makay A. Acta Paediatr. Hung. 1990, 30 (2), 217-224.
[23] Suno M., Nagaoka A. Nippon Yakurigaku
Zasshi, 1988, 91 (5), 295-299.
[24] Boda J., Karsay K., Czako L., Fugi
S., Kovacs A., Koncz I., Maczko P. A. Ther. Hung., 1989, 37 (3), 176-180.
[25] Molnar P., Gaal L. Eur. J. Pharmacol.,
1992, 215 (1), 17-22.
[26] Kiss B., Karpati E. Acta Pharm.
Hung., 1996, 66 (5), 213-224.
[27] Hadjiev D., Yancheva S.
Arzneimittelforschung, 1976, 26 (10A), 1947-1950.
[28] Rischke R., Krieglstein J.
Pharmacology, 1990, 41 (3), 153-160.
[29] Karpati E., Szporny L.
Arzneimittelforschung, 1976, 26 (10A),1908-1912.
[30] Paulo T., Toth P.T., Nguyen T.T.,
Forgacs L., Torok T.L., Magyar K. J. Pharm. Pharmacol., 1986, 38 (9),
668-73.
[31] Poroikov V.V., Filimonov D.A.,
Stepanchikova A.V. Biological Activity Spectra Prediction as a Tool to
Select the Most Prospective Compounds from Commercial and In-House
Databases. Abstr. Intern. Med. Chem. Symp., Seoul, 1997, P.143.
[32] Poroikov V.V, Filimonov D.A,
Boudunova A.P. Computer Assisted Prediction of Biological Activity Spectra:
Estimating the Effectivity of Use in High Throughput Screening. Abstr: XIVth
International Symposium on Medicinal Chemistry, Maastricht, the Netherlands,
1996, P-3.05.
[33] Trapkov V.A., Budunova A.P., Burova
O.A., Filimonov D.A., Poroikov V.V. Discovery of New Antiulcer Agents by
Computer Aided Prediction of Biological Activity. Problems in Medical
Chemistry (Moscow), 1997, 43 (1), 41-57.
[34] Islyaikin M.K., Danilova E.A., Kudrik
E.V., Smirnov R.P., Boudunova A.P., Kinzirskii A.S. Synthesis and study of
antitumor action of macroheterocyclic compounds and their complexes with
metals. Chemical & Pharmaceutical J. (Rus), 1997, 31 (8), 19-22.
[35] Maiboroda D.A., Babaev E.V.,
Goncharenko L.V. (1998). Synthesis and study of spectral and pharmacological
properties of 1-amino-4-(5-aryloxazolyl-2)-butadienes-1,3. Chemical &
Pharmaceutical J. (Rus), 32 (6), 24-28.
[36] Geronikaki A., Poroikov V.,
Hajipavlou-Litina D., Mgonzo R., Filimonov D., Lagunin A. Synthesis,
computer assisted prediction of biological activity spectra and experimental
testing of new thiazole derivatives. Quantitative Structure-Activity
Relationships, 1998, In press
References by abbreviation
Anzali S., Barnickel G., Cezanne B., Krug M., Filimonov D.,
Poroikov V. (2001). Discriminating between drugs and nondrugs by Prediction of
Activity Spectra for Substances (PASS). J. Med. Chem. 44: 2432-2437.
Avidon V.V. (1974). Criteria for the comparison of chemical
structures and principles of construction of an information language for a
logical information system for biologically active compounds. Pharm-Chem. J. (Rus).
8: 22-25.
Avidon V.V., Arolovich V.S., Kozlova S.P., Piruzian L.A.
(1978a). Statistical study of information file on biologically active compounds.
II. Choice of decision rule for biological activity prediction. Pharm-Chem. J. (Rus).
12: 88-93.
Avidon V.V., Arolovich V.S., Kozlova S.P., Piruzian L.A.
(1978b). Statistical investigation of large volumes of data with respect to the
biological activity of compounds III. Selection of a determinant for predicting
biological activity. Pharm-Chem. J. (Rus). 12: 99–106.
Avidon V.V., Pomerantsev I.A., Rozenblit A.B., Golender V.E.
(1982). Structure-activity relationship oriented languages for chemical
structure representation. J. Chem. Inf. Comput. Sci. 22: 207-214.
Avidon V.V., Arolovich V.S., Blinova V.G., Freidina A.M.
(1983). Statistical investigation of the data file on biologically active
compounds. V. Allowance for the novelty of the chemical structure in the
prediction of the biological activity by an improved method of substructural
analysis. Pharm-Chem. J. (Rus). 17: 59-62.
Burov Yu.V., Poroikov V.V., Korolchenko L.V. (1990). National
system for registration and biological testing of chemical compounds: facilities
for new drugs search. Bull. Natl. Cent. Biol. Active Compnds (Rus.). No. 1:
4-25.
Delmas F., Di Giorgio C., Robin M., Azas N., Gasquet M.,
Detang C., Costa M., Timon-David P., Galy J.P. (2002). In vitro activities of
position 2 substitution-bearing 6-nitro- and 6-aminobenzothiazoles and their
corresponding anthranilic acid derivatives against Leishmania infantum and
Trichomonas vaginalis. Antimicrob. Agents Chemother. 46: 2588–2594.
Di Giorgio C., Delmas F., Filloux N., Robin M., Seferian L.,
Azas N., Gasquet M., Costa M., Timon-David P., Galy J.P. (2003). In vitro
activities of 7-substituted 9-chloro and 9-amino-2-methoxyacridines and their
bis- and tetra-acridine complexes against Leishmania infantum. Antimicrob.
Agents Chemother. 47: 174–180.
Di Giorgio C., Delmas F., Ollivier E., Elias R., Balansard G.,
Timon-David P. (2004). In vitro activity of the beta-carboline alkaloids harmane,
harmine, and harmaline toward parasites of the species Leishmania infantum. Exp.
Parasitol. 106: 67–74.
Dolzhenko A.V., Kolotova N.V., Koz'minykh V.O., Vasilyuk M.V.,
Kotegov V.P., Novoselova G.N., Syropyatov B.Ya., Vakhrin M.I. (2003).
Substituted amides and hydrazides of dicarboxylic acids. Part 14. Synthesis and
antimicrobial and antiinflammatory activity of 4-antipyrylamides,
2-thiazolylamides, and 1-triazolylamides of some dicarboxylic acids. Pharm-Chem.
J. 37: 149–151.
Filimonov D.A., Poroikov V.V., Karaicheva E.I., Kazarian R.K.,
Budunova A.P., Mikhailovskii E.M., Rudnitskikh A.V., Goncharenko L.V., Burov
Yu.V. (1995). Computer-aided prediction of biological activity spectra of
chemical substances on the basis of their structural formulae: computerized
system PASS. Exper. Clin. Pharmacol. (Rus). 58: 56-62.
Filimonov D.A., Poroikov V.V. (1996). PASS: computerized
prediction of biological activity spectra for chemical substances. In: Bioactive
Compound Design: Possibilities for Industrial Use, BIOS Scientific Publishers,
Oxford (UK), pp.47-56.
Filimonov D., Poroikov V., Borodina Yu., Gloriozova T. (1999).
Chemical Similarity Assessment through multilevel neighborhoods of atoms:
definition and comparison with the other descriptors. J. Chem. Inf. Comput. Sci.
39: 666-670.
Filimonov D.A., Poroikov V.V. (2006). Prediction of biological
activity spectra for organic compounds. Russian Chemical Journal, 50 (2), 66-75.
Filimonov D.A., Poroikov V.V. (2008). Probabilistic approach
in activity prediction. In: Chemoinformatics Approaches to Virtual Screening.
Eds. Alexandre Varnek and Alexander Tropsha. Cambridge (UK): RSC Publishing,
182-216.
Filimonov D.A., Zakharov A.V., Lagunin A.A., Poroikov V.V.
(2009). QNA based “Star Track” QSAR approach. SAR & QSAR Environ. Res. 20:
679-709.
Geronikaki A., Babaev E., Dearden J., Dehaen W., Filimonov D.,
Galaeva I., Krajneva V., Lagunin A., Macaev F., Molodavkin G., Poroikov V.,
Saloutin V., Stepanchikova A., Voronina T. (2004). Design of new anxiolytics:
from computer prediction to synthesis and biological evaluation. Bioorg. Med.
Chem. 12: 6559-6568.
Geronikaki A., Druzhilovsky D., Zakharov A., Poroikov V.
(2008a). Computer-aided predictions for medicinal chemistry via Internet. SAR &
QSAR Environ. Res. 19: 27-38.
Geronikaki A.A., Lagunin A.A., Hadjipavlou-Litina D.I.,
Elefteriou P.T., Filimonov D.A., Poroikov V.V., Alam I., Saxena A.K. (2008b).
Computer-aided discovery of anti-inflammatory thiazolidinones with dual
cyclooxygenase/lipoxygenase inhibition. J. Med. Chem. 51: 1601-1609.
Goel R.K., Kumar V., Mahajan M.P. (2005). Quinazolines
revisited: search for novel anxiolytic and GABAergic agents. Bioorg. Med. Chem.
Lett. 15: 2145–2148.
Golender V.E., Rozenblit A.E. (1978). Computer Methods for
Drug Design. Riga: Zinatne, 232 pp.
Golender V.E., Rosenblit A.B. (1983). Logical and
Combinatorial Algorithms for Drug Design, Research Studies Press, Wiley&Sons,
352 pp.
Labanauskas L., Brukstus A., Udrenaite E., Bucinskaite V.,
Susvilo I., Urbelis G. (2005). Synthesis and anti-inflammatory activity of
1-acylaminoalkyl-3,4-dialkoxybenzene derivatives. Il Farmaco. 60: 203–207.
Lagunin A., Stepanchikova A., Filimonov D., Poroikov V.
(2000). PASS: prediction of activity spectra for biologically active substances.
Bioinformatics. 16: 747-748.
Lagunin A.A., Gomazkov O.A., Filimonov D.A., Gureeva T.A.,
Dilakyan E.A., Kugaevskaya E.V., Elisseeva Yu.E., Solovyeva N.I., Poroikov V.V.
(2003). Computer-aided selection of potential antihypertensive compounds with
dual mechanisms of action. J. Med. Chem. 46: 3326-3332.
PASS program package, © Filimonov D.A., Poroikov V.V.,
Gloriozova T.A., Lagunin A.A. Russian State Patent Agency, N 2006613275 of
15.09.2006.
PharmaExpert program package, © Lagunin A.A., Poroikov V.V.,
Filimonov D.A., Gloriozova T.A. Russian State Patent Agency, N 2006613590 of
16.10.2006.
Poroikov V.V., Filimonov D.A., Boudunova A.P. (1993).
Comparison of the Results of Prediction of the Spectra of Biological Activity of
Chemical Compounds by Experts and the PASS System. Automat Document Math
Linguistics. 27: 40-43.
Poroikov V.V., Filimonov D.A., Borodina Yu.V., Lagunin A.A.,
Kos A. (2000). Robustness of biological activity spectra predicting by computer
program PASS for non-congeneric sets of chemical compounds. J. Chem. Inform.
Comput. Sci. 40: 1349-1355.
Poroikov V., Akimov D., Shabelnikova E., Filimonov D. (2001).
Top 200 medicines: can new actions be discovered through computer-aided
prediction? SAR and QSAR in Environmental Research, 12 (4), 327-344.
Poroikov V.V., Filimonov D.A. (2002). How to acquire new
biological activities in old compounds by computer prediction. J. Comput. Aid.
Molec. Des., 16 (11), 819-824.
Poroikov V.V., Filimonov D.A., Ihlenfeldt W.-D., Gloriozova
T.A., Lagunin A.A., Borodina Yu.V., Stepanchikova A.V., Nicklaus M.C. (2003).
PASS Biological Activity Spectrum Predictions in the Enhanced Open NCI Database
Browser. J. Chem. Inform. Comput. Sci. 43: 228-236.
Poroikov V., Filimonov D. (2005). PASS: Prediction of
Biological Activity Spectra for Substances. In: Predictive Toxicology. Ed. by
Christoph Helma. Taylor & Francis, 459-478.
Poroikov V., Lagunin A., Filimonov D. (2005). PharmaExpert:
diseases, targets and ligands – three in one. QSAR and Molecular Modelling in
Rational Design of Bioactive Molecules. Eds. Esin Aki Sener, Ismail Yalcin,
Ankara (Turkey), CADD & D Society, 514-515.
Poroikov V., Filimonov D., Lagunin A., Gloriozova T., Zakharov
A. (2007). PASS: Identification of probable targets and mechanisms of toxicity.
SAR & QSAR in Environmental Research., 18 (1-2), 101-110.
Sadym A., Lagunin A., Filimonov D., Poroikov V. (2003).
Prediction of biological activity spectra via Internet. SAR & QSAR Environ. Res.
14: 339-347.
Stepanchikova A.V., Lagunin A.A., Filimonov D.A., Poroikov
V.V. (2003). Prediction of biological activity spectra for substances:
Evaluation on the diverse set of drug-like structures. Curr. Med. Chem. 10:
225-233.