DOES QUANTUM CHEMISTRY HAVE A PLACE IN CHEMINFORMATICS?
Timothy Clark
Computer-Chemie-Centrum, Friedrich-Alexander-Universität Erlangen-Nürnberg,
Nägelsbachstrasse 25, 91052 Erlangen, Germany.
Received: 9th
July 2002 / Published: 15th
May 2003
Abstract
The possible role of quantum mechanical (QM) techniques in cheminformatics
is discussed. The advantages, disadvantages and capabilities of QM and its
applicability to databases of thousands of molecules are discussed. The critical
relationship between quantitative structure-property relationships (QSPRs)
and the quality of the experimental data is discussed using aqueous solubility
as an example. The use of QM-derived descriptors to investigate physical property
space and to characterise compounds as drug-like or non-drug-like is illustrated.
Finally, it is pointed out that not QM-calculations, but rather a knowledge
of the molecular electron density is necessary for the examples shown, and
a technique that can reproduce the electron density without QM-calculations
is presented.
Introduction
Quantum mechanical calculations are not usually considered to be applicable
to cheminformatics, although we have shown that semiempirical MO-calculations
can be used on complete databases (1) and can play an important role in many
cheminformatics applications (2,3). This article is intended to provide an
overview of the applicability and capabilities of quantum mechanical techniques
for cheminformatics and to discuss the relationships between data, descriptors
and properties in quantitative structure-property relationships (QSPRs). Finally,
an alternative technique for deriving the molecular electron density without
quantum mechanics will be described.
Typically, cheminformatics applications use 2D- or very simple, classically
derived 3D-descriptors for quantitative structure-activity relationships (QSARs)
and QSPRs. As the border between, for instance, QSAR and pharmacophore-based
high-throughput virtual screening is very poorly defined, many cheminformatics
tasks can be considered to be simply more traditional QSAR or QSPR applied
to larger numbers of molecules. In this respect, the constant advances in
hard- and software performance tend to make the border even less clear because
larger datasets become manageable with every advance. We have shown (2,3)
that complete databases of tens of thousand of compounds can be treated with
economical quantum mechanical techniques and have discussed the advantages
of detailed quantum mechanical descriptions of molecules for QSPR (2) and
QSAR (3). What, however, has changed since in the two years since references
2 and 3 were written? Is quantum mechanics still a useful tool for cheminformatics?
Will it displace more traditional techniques? Are there alternatives?
Why Use Quantum Mechanics?
The advantages of using semimpirical MO-calculations to calculate molecular
descriptors have been described before (2,3) and will only be outlined briefly
here. The resolution of the molecular electrostatic properties is generally
higher (i.e. atoms are not treated isotropically) in quantum mechanical calculations.
This generally results in better descriptions of the molecular electrostatic
potential in important regions of the molecular surface (i.e. where bonding
interactions occur). Furthermore, electronic properties such as polarisability,
ionisation potentials, electron affinities, dipole and higher multipole moments
etc. and descriptors derived from them often prove to be very useful descriptors,
especially in QSPR-applications. An example of the use of such descriptors
is given in our recent work on the hydrogen-bond acceptor strengths of nitrogen
heterocycles (4). Note however, that many of the properties listed above can
be obtained from the electron density, so that efficient methods for generating
an accurate electron density without quantum mechanics are potentially of
great interest for cheminformatics.
However, for the moment we should assess the reliability of the most computationally
economical for of molecular orbital theory, the modern semiempirical techniques
MNDO (5), AM1 (6), and PM3 (7) for calculating the properties listed above.
These techniques are parameterised to reproduce experimental heats of formation,
molecular structures, dipole moments and ionisation potentials (from Koopmans'
theorem). These properties (especially the dipole moment) ensure that the
electron densities calculated are generally quite accurate, so that the molecular
electrostatics calculated by semiempirical techniques agree well with high
level ab
initio data (8, 9). However, what about less directly parameterised
properties like the polarisability, which is often thought to be very difficult
to calculate and which requires very extensive basis sets at ab
initio levels of theory? The solution, as for many properties in semiempirical
theory, is to use a fast and effective level of theory to calculate the property
in question and parameterise the method against experimental data. In this
case, Rivail and his coworkers (10) published a very fast and simple variational
technique for calculating the molecular electronic polarisability. However,
this variational method is prone to systematic and element-specific errors,
so that we (11) parameterised the element-specific integrals involved especially
to reproduce the experimental data. This resulted in a fast and accurate method
for calculation of the molecular electronic polarisability that is also amenable
to partitioning into group, atomic or even orbital contributions (3, 12).
The molecular electronic polarisability proves to be an important descriptor
in most of our QSPR models and plays a very significant role in describing
physical property space (13).
The general impression is that CPU-requirements for quantum mechanical calculations
preclude them from being used for cheminformatics applications. This is not
necessarily the case. In a first feasibility study (1), we were able to process
the entire Maybridge database (about 53,000 compounds) on a 128-processor
SGI Origin 2000 in half a day. However, computer performance has increased,
and above all prices have decreased, since this study was performed, so that
a Euro 1000 computer can process about 1,500 typical druglike compounds per
day (full optimisations with AM1 or PM3).
Thus, there are relatively few real obstacles to using semiempirical MO-theory
for cheminformatics. Do we, however, really need "better" descriptors, for
instance for QSPR applications?
Experimental Data and QSPR Models
QSPR-models are derived by calculating descriptors for each molecule in the
dataset and then using an interpolation technique (regression, neural net
etc.) to relate the descriptors to the property. The quality of the model
obtained is necessarily limited by the quality of the experimental data. Have
we, however, already reached the limit of data-accuracy for some properties?
Figure 1 shows results for an AM1/neural net model (14) for aqueous solubility
based on a training set of solubilities for 559 compounds at 298K. This model
gives a standard deviation between calculation and experiment of 0.51 log
units, a mean unsigned error (MUE) of 0.40, a maximum error of 1.67 and 35%
of the predictions outside the calculated (15) (±
1 standard deviation) error bars.
Figure
1. The performance of an artificial neural net QSPR-model14
for aqueous solubility based on AM1 descriptors. The error bars shown are
calculated according to the procedure outlined in reference 15.
These results are typical. A literature survey reveals that for 11 studies
(some of which used related experimental data, not just calculated descriptors)
using datasets between 399 and 1312 compounds, the six published standard
deviations between calculation and experiment average to 0.60 log units (including
one extreme outlier at 0.16) and the five published root mean square deviations
(RMSD) average to 0.68. If we ignore the outlier, the standard deviations
range from 0.57 to 0.79 and the RMSDs from 0.62 to 0.76. Thus, all published
models perform very similarly, although they use very different types of descriptors
and interpolation techniques. What is not different, however, are the data.
How reliable are experimental solubility measurements? Yalkowsky and Banerjee16
have outlined the experimental techniques for and difficulties encountered
in measuring aqueous solubility. They also include a table of measured solubilities
for some "extremely
hazardous substances" (reference 16, Appendix C). As a crude estimate
of the reliability of experimental data, Figure 2 shows the highest experimental
value plotted against the lowest for the 18 compounds for which more than
one measurement is listed.
The standard deviation between the two sets of experimental values is 0.79
log units, the MUE is 0.48, the RMSD 0.76 and the largest error 1.93. Even
though a sample of only 18 compounds cannot be considered reliable, the conclusion
seems clear that QSPR-models cannot be very much better than this correlation.
Figure
2. A plot of the highest experimental values for aqueous solubility
against the lowest for 18 compounds taken from a table of "extremely
hazardous substances" (reference 16, Appendix C).
We may even be in a situation in which our available descriptors could describe
solubility very much better than they do if better experimental data were
available (i.e. the models are limited by the data, not by the descriptors
or the interpolation technique). Is it the worth developing new, "better"
descriptors if the experimental data do not justify their use?
Making Better Use of the Available Data
It is not necessarily true that a QSPR model for a given property cannot be
better than the dataset used to train for that property. Consider, for instance,
molecular mechanics (force field) calculations for alkanes. Several high quality
force fields give heats of formation for alkanes that are more reliable than
experimentally measured values. This is possible because the force field (an
unusual type of QSPR model) is not only trained to reproduce heats of formation,
but also, for instance, structures, isomerisation energies etc. Thus, a very
general QSPR model that is trained to reproduce several directly related properties
can be more reliable than the experimental values for one or more of these
properties. We recently tested (17) the applicability of a single QSPR model
to more than one property for a simple example, vapour pressures. Note that
we have not yet trained the model using data for more than one property, but
have only tested the possibility that a single model can be used for several
properties, in this case vapour pressure, boiling point and heat of vapourisation.
Our first QSPR model for vapour pressure (15) used only data measured at 298K
taken from the Beilstein database. This limits the training/test dataset to
551 compounds. Other authors (18) have corrected some of their experimental
data to 298K using the published temperature dependence. This is a legitimate
way to extend the available data. It is, however, not necessarily the best
because it does not allow use of data for compounds for which the temperature
dependence of the vapour pressure is not known.
Our approach (17) is to include the temperature of the measurement as an additional
descriptor in the QSPR model. This forces the interpolation technique (in
this case a feed forward neural net) to learn the temperature dependence as
part of the model. We can then gain extra information by interrogating the
trained net about this temperature dependence. In this case, the boiling point
at atmospheric pressure can be calculated by finding the temperature at which
the vapour pressure reaches this value and the heat of vapourisation can be
derived using the Clausius-Clapeyron equation. Figure 3 gives an idea of exactly
now much more data is available in this case than for a model limited to 298K.
Figure
3. Distribution of the experimental data used to train a variable
temperature vapour pressure model with respect to the vapour pressure itself
and the temparature of the measurement. The horizontal line indicates the
data available at 298 K.
Instead of the data for 551 compounds, we now have 8,542 data points at temperatures
between 76K and 800K for 2,349 different compounds. Applying the model to
boiling points gives a standard deviation between calculated values and experiment
of 28.6K and a MUE of 18.7K for our boiling point dataset, compared with values
of 21.9K and 13.5K, respectively, for a model (19) trained only to reproduce
boiling points. A small sample of heats of vaporisation also showed a standard
deviation between calculation and experiment of only 4.7 kcal mol-1
and a maximum error of 12.0 kcal mol-1.
This is, however, only an indication of what is possible. We must learn to
make the most possible use of all the reliable
data available. Martin Hicks has pointed out (20) that there are different
types of experimental data. Boiling points, for instance, are measured routinely,
and probably not very carefully, for every liquid organic compound reported
in the literature. These data are not a good basis for a boiling point model
because they are incidental properties used to characterise the compound.
Our experience suggests that in many cases it is not recorded that the boiling
point was measured at reduced pressure, or even that the temperature scale
used (Celsius or Kelvin) is reported wrongly. There are, however, certainly
data measured in studies whose main aim was to determine boiling points accurately.
These data can all be used, even those at reduced pressure, because they are
simply another way of reporting an experimental vapour pressure. Similarly,
heats of vaporisation are usually measured very carefully for very pure compounds
and so should be as reliable as is possible for measurements on a difficult
quantity. Heats of vaporisation cannot be used directly to train a neural
net by back-propagation because they require the slope of the vapour pressure
change with temperature. However, training feed forward neural nets by genetic
algorithms, rather than direct back-propagation, is an established technique
(21) that allows us to use derived properties to determine the error function
to be minimised as well as those generated directly by the neural net. We
are now investigating the effectiveness of such an approach in which linked
physical properties are used to generate more general, and thus more reliable,
QSPR models (22).
Another potential use of information-rich molecular descriptors is to map
chemical compounds according to their physical properties, as has been demonstrated
by Oprea et
al. (23) in their "Chemical
GPS" technique. We have used this approach to investigate the clustering
of druglike compounds on such a map, primarily in order to distinguish drugs
from nondrugs, but as a more general goal to be able to relate new compounds
to known closely related ones.
Physical Properties, Descriptors and Compound Maps
The idea of mapping compounds according to, for instance, their physical properties
is that physically similar but possibly chemically diverse compounds should
occur close to each other, and thus have similar ADME properties etc. Thus,
rather than conventional QSPR being used to predict individual properties,
compounds would be compared with their known neighbours and assumed to behave
similarly. So far so good, but how do we decide which descriptors (and how
many) to use for the mapping? In order to be able to treat this question rationally,
we should at least have some idea of the dimensionality and the descriptors
appropriate for describing physical property space. Lipinksi (24) has described
physical property space as being low dimensional (i.e. we only need a few
descriptors to describe physical properties). We investigated (13) both the
dimensionality and the nature of physical property space by calculating a
range of descriptors known to be suitable for QSPR models for the entire Maybridge
database plus a set of about 2,500 selected drugs. The principle components
of the 26 descriptors that appear in many of our QSPR models were then calculated
in order to characterise physical property space. The conclusions of this
study are that 8-9 descriptors are enough to describe physical properties
and that these can be loosely classified as shown in Table 1.
Table
1. Qualitative descriptions of the principle components of descriptors
used to describe physical property space (13).
|
Principle component number
|
% variance
explained
|
Main descriptors
|
Interpretation
|
|
1
|
23.3
|
Polarisability, molecular weight, surface area, globularity
|
Size, shape
|
|
2
|
18.5
|
Maximum MEP*, mean positive and negative MEPs, total variance (25)
|
Complementary electrostatic surface descriptors
|
|
3
|
9.1
|
Minimum MEP, mean negative MEP, balance parameter
(25)
|
|
4
|
7.6
|
Total MEP-derived charges on nitrogens (26), number of H-bond acceptors
|
Complementary hydrogen-bonding descriptors
|
|
5
|
5.4
|
Total MEP-derived charges on H and O (26), minimum MEP, number of
aromatic rings
|
|
6
|
5.4
|
Dipole moment, dipolar density (27)
|
Dipolar polarity
|
|
7-9
|
3.9 - 4.3
|
Total MEP-derived charges on different types of atoms
|
Chemical diversity
|
*
MEP = molecular electrostatic potential at the solvent-excluded surface.
The interpretations of the individual PCs give an intuitive picture of the
factors determining the physical properties of molecules. Most important are
the size and shape, followed by two complementary descriptors that describe
the higher multipole character of the electrostatics at the surface of the
molecules. Note that the dipole moment is not important in these two descriptors.
Next, come two complementary descriptors that essentially describe the hydrogen-bond
donor and acceptor properties (including aromatic ring acceptors) and then
the simple dipolar polarity, which perhaps surprisingly only accounts for
just over 5% of the variance described by all the descriptors. PCs 7-9 are
essentially atom counts that describe chemical diversity. These descriptors
probably occur in QSPR models to correct for systematic AM1 or PM3 errors
for some elements.
Thus, one appropriate approach to mapping chemicals according to their physical
properties would be to use the first six principle components described in
Table 1 as descriptors and to ignore the "chemical diversity" PCs in order
to train, for instance, a Kohonen net. According to our analysis, the resulting
map should cluster compounds with similar physical properties. This work is
still in progress.
Descriptors can, however, also be selected to map for a more limited goal,
such as distinguishing drugs from nondrugs (13). We have used recursive partitioning
(28) In order to select descriptors for such a mapping, thus sacrificing some
of the unsupervised quality of the Kohonen net by using a supervised descriptor
selection process. The resulting map can not only distinguish drugs from nondrugs
with about the same efficiency as other published techniques (29), but also
differentiate between, for instance, hormones and other drugs (13).
The above applications use descriptors that are predominantly derived from
the electron density, in our examples calculated using semiempirical MO-theory.
However, it would be more efficient to use classical methods to optimise the
molecular geometries and then a non-quantum technique to approximate the electron
density. We are currently developing such a technique on which to base future,
faster QSPR methods.
A Non Quantum Mechanical Approach to Electron Density
The principle of electronegativity equalisation (30) enjoyed some popularity
20 years ago and is the basis of the still popular Gasteiger-Marsili charges
(31). However, all such models known to us calculate isotropic net atomic
charges, rather than considering the inherent atomic anisotropies caused by
the bonding situation. We are currently developing a procedure (32) that considers
this atomic anisotropy by considering hybrid atomic orbitals and their interactions.
Currently, we parameterise the model to reproduce the AM1 electron density,
but high level ab
initio or density functional data could also be used. Figure 4 shows
a flow chart of the calculations steps involved.
The input to the program is a simple Lewis structure, which may be derived
from a force-field calculation (in which case the bonds, hybridisation etc.
are fully defined) or a set of 3D-coordinates, from which a Lewis structure
must be derived using bond-distance criteria.
Figure
4. Flow chart of the classical procedure used to calculate electron
densities (32).
The hybrid atomic orbitals are then calculated for each nonhydrogen atom using
the procedure outlined previously for the hybrid orbital/point charge technique
(33). The hybrid orbitals are then combined to bonding and antibonding localised
molecular orbitals (LMOs) for the s-framework
and the lone pairs are identified. The remaining hybrid orbitals are considered
components of the p-system,
which is treated using a parameterised Hückel-like procedure with variable
electronegativities. Negative hyperconjugation (donation from lone pairs into
neighbouring S*-LMOs)
(34) corrections are added specifically at this stage. The s-system
is then allowed to undergo a mutual polarisation step analogous to electronegativity
equalisation, but based on the electrostatic potentials at the nuclei (35).
This is an iterative procedure that usually converges within 4-10 cycles.
The initial parameterisation was restricted to compounds of the elements H,
C, N and O and without p-systems.
A training set of 52 representative compounds was calculated with AM1 and
the geometries thus obtained used for the parameterisation. The error function
was based on the one-atom (4x4)
blocks of the AM1 density matrix. The error in the off-diagonal elements was
weighted with a factor of 0.5 compared with the diagonal elements. The error
function was minimised with either the simplex or the BFGS optimisation algorithms.
A validation set of 25 cycloalkanes, ethers, amines, alcohols, sugars and
steroids was used to test the resulting parameters. The preliminary results
are shown in Table 2.
Table
2. Results obtained for the validation set of 25 compounds.
"RMS" is the root mean square deviation between the target (AM1) value and
that calculated by the non-quantum mechanical procedure.
|
Property
|
Numbers
|
RMS
|
|
Diagonal density matrix elements
|
1774
|
0.029
|
|
Off-diagonal one-atom density matrix elements
|
1986
|
0.029
|
|
Coulson atomic charges
|
781
|
0.037
|
The quality of the fit can also be expressed in properties that are more familiar.
The steroid 1,
for instance, is part of the validation set.
The parameterised procedure reproduces the AM1-calculated dipole moment with
an error of 0.12 Debye, the root mean square of the Coulson net atomic charges
is 0.027 and the root mean square deviation of the on-atom density matrix
elements is 0.019. The molecular electrostatics of the molecule are thus well
described by the new procedure, which requires only milliseconds of CPU-time.
Figure 5 shows a plot of the molecular electrostatic potential at the solvent-excluded
surface of the molecule. Only the areas with the largest deviation (below
-5 kcal mol-1
and above 15 kcal mol-1)
are shown and the colour scale (blue to red) ranges from -9 kcal mol-1
to 19 kcal mol-1.
The procedure outlined above would result in a vastly increased computational
capacity for electron-density-based cheminformatics applications.
Figure
5. The molecular electrostatic potential at the solvent-excluded surface
of 1.
Only the areas with the largest deviation (below -5 and above 15 kcal mol-1)
between the classical technique and the full AM1 calculation are shown. The
colour scale (blue to red) ranges from -9 to 19 kcal mol-1.
It is computationally very efficient, so that large databases or even complete
enzymes can be treated easily. The electron density is also polarisable so
that, for instance, the method could be used to calculate the electrostatics
of the classical part of a hybrid QM/MM calculation without major inconsistencies
in the electrostatic treatment of the classical and the quantum mechanical
parts. Similarly, classically derived electron densities can be used as economical
initial guess densities for MO-calculations. In this case, they have the advantage
that no matrix diagonalisation is necessary, making the technique eminently
suitable for parallel computers. A fast method for calculating accurate electron
densities for proteins would also help the refinement of their X-ray structures.
Conclusions
Quantum mechanical methods, especially semiempirical MO-theory, can be used
for cheminformatics applications. Advances in computer hardware have made
semiempirical MO geometry optimisations on databases of 50-100,000 compounds
commonplace on economical compute clusters. However, if we are to use the
additional information provided by quantum mechanics relative to classical
techniques, we must adopt a new paradigm for our QSPR and QSAR models, which
are now often limited not by the descriptors, but rather by the quality of
the training data. More general, physically rational models are needed that
relate several physical properties to each other in order to eliminate biases
or weaknesses in the training data for any one property. It is, for instance,
unlikely that a dataset of heats of vapourisation will suffer from the same
systematic problems as one for boiling points. High quality, possibly quantum
mechanical descriptors will be needed should such compound QSPR models prove
successful.
Many of our current descriptors, however, only require the electron density,
not a wavefunction. An extension of the well known electronegativity equalisation
technique to the calculations of a detailed electron density may prove to
offer the ideal compromise between the detail offered by quantum mechanical
calculations and the computational efficiency of classical methods. Our initial
model has demonstrated the viability of such techniques. It promises to be
of very general use wherever a fast, relatively accurate calculation of the
electron density is required. As the algorithm is inherently parallel, it
can be used for very large systems and may even be suitable for use in a polarisable
force field.
Acknowledgements
This work was supported by the Fonds der Chemischen Industrie. I especially
thank all my coworkers, who are named in the corresponding references and
who have contributed enormously to the developments described above.
References
[1] Beck, B., Horn, A., Carpenter, J. E., Clark, T. (1998). J.
Chem. Inf. Comput. Sci. 38:1214.
[2]
Quantum Cheminformatics: An Oxymoron? (Part 1) T. Clark in
Chemical
Data Analysis in the Large: The Challenge of the Automation Age,
M. G. Hicks (Ed.),
Proceedings
of the Beilstein-Institut Workshop, May 22
nd
- 26
th,
2000, Bozen, Italy:
http://www.beilstein-institut.de/bozen2000/proceedings
[3]
Quantum Cheminformatics: An Oxymoron?, (Part 2) T. Clark, in Rational
Approaches to Drug Design, H.-D. Höltje and W. Sippl (Eds), Prous Science,
Barcelona, (2001).
[4] M. Hennemann & T. Clark (2002). J.
Mol.Model. 8:95-101.
[5] Dewar, M. J. S. & Thiel, W. (1977). J. Am. Chem. Soc. 99 :4899,4907;
Thiel, W. (1998). Encyclopedia
of Computational Chemistry, Schleyer, P. v. R., Allinger, N. L., Clark, T.,
Gasteiger, J., Kollman, P. A., Schaefer, H. F.,III and Schreiner, P. R. (Eds),
Wiley, Chichester, (1998), 3:1599.
[6] Dewar, M. J. S., Zoebisch, E. G., Healy, E. F., Stewart, J. J. P. (1985). J.
Am. Chem. Soc., 107:3902; Holder, A. J. Encyclopedia
of Computational Chemistry; Schleyer, P. v. R., Allinger, N. L., Clark, T.,
Gasteiger, J., Kollman, P. A., Schaefer III, H. F., Schreiner, P. R. (Eds),
Wiley, Chichester, (1998),1:8.
[7] Stewart, J. J. P. (1989). J.
Comput. Chem., 10,
209:221; Stewart, J. J. P.; Encyclopedia
of Computational Chemistry, Schleyer, P. v. R., Allinger, N. L., Clark, T.,
Gasteiger, J., Kollman, P. A., Schaefer III., H. F., Schreiner, P. R. (Eds),
Wiley, Chichester, (1998), 3:2080.
[8] Rauhut, G. & Clark, T. (1993).
J. Comput. Chem. 14:503.
[9] Beck, B., Rauhut, G., Clark, T. (1994). J.
Comput. Chem. 15:1064.
[10] Rinaldi, D. & Rivail, J.-L. (1974). Theoret.
Chim. Acta 32:57; 243; Rivail, J.-L. & Carter, A. (1978). Mol.
Phys. 36:1085.
[11] Schürer, G., Gedeck, P., Gottschalk, M., Clark, T. (1999). Int.
J. Quant. Chem. 75:17.
[12] Martin, B., Gedeck, P., Clark, T. (2000). Int.
J. Quant. Chem. 77:473.
[13] Brüstle, M., Beck, B., Schindler, T., King, W., Mitchell, T., Clark, T. (2002).
J.
Med. Chem. in press.
[14] Beck, B. & Clark, T.; manuscript in preparation.
[15] Beck, B., Breindl, A., Clark, T. (2000). J.
Chem. Inf. Comput. Sci. 40:1046.
[16] Yalkowsky, S. H. & Banerjee, S. Aqueous
Solubility, Marcel Dekker, New York, (1992).
[17] Chalk, A. J., Beck, B., Clark, T. (2001). J.
Chem. Inf. Comput. Sci. 41:1053.
[18] McClelland, H. E. & Jurs, P. C. (2000). J.
Chem. Inf. Comput. Sci. 50:967.
[19] Chalk, A. J., Beck, B., Clark, T. (2001). J.
Chem. Inf. Comput. Sci. 41:457.
[20] Hicks, M. (2002). Personal communication and comment at the Bozen Workshop.
[21] Montana, D. J. & Davis, L. D., in Proceedings
of the International Joint Conference on Artificial Intelligence, Morgan
Kaufmann, San Francisco, (1989); M. Mitchell, An
Introduction to genetic Algorithms, The MIT Press, Cambridge, MA, (1998).
[22] Brüstle, M. & Clark, T., unpublished.
[23] Oprea, T. I. & Gottfries, J. (2001). ChemGPS:
A Chemical Space Navigation Tool, in Rational
Approaches to Drug Design: 13th
European Symposium on QSAR, H.-D. Höltje and
W. Sippl (Eds), Prous Science, Barcelona, p. 437, (2001); Oprea, T. I. &
Gottfries, J. (2001). J.
Comb. Chem. 3:157.
[24] Lipinski, C. A., Lombardo, F., Dominy, B. W., Feeney, P. J. (1997). Avd.
Drug. Delivery Rev. 23:3.
[25] Murray, J. S. & Politzer, P. (1998). J.
Mol. Struct (Theochem). 425:107; Murray, J. S., Lane, P., Brinck, T., Paulsen, K., Grince, M. E., Politzer, P. (1993). J.
Phys. Chem. 97:9369.
[26] Beck, B., Clark, T. Glen, R. C. (1995). J.
Mol. Model. 1:176.
[27] Cronce, D. T., Famini, G. R., DeSoto, J A., Wilson, L. Y. (1998). J.
Chem. Soc., Perkin Trans. 2:1293.
[29] Sadowski, J. & Kubinyi, H. A. (1998). J.
Med. Chem. 41:3325; Wagener, M. & van Geersestein, V. J. (2000). J.
Chem. Inf. Comput. Sci. 40:280; Ajay, Walters, W. P. Murcko, M. A. (1998). J.
Med. Chem. 41:3314.
[30] Sanderson, R. T. (1974). Educ.
Chem. 11:80.
[31] Gasteiger, J. & Marsili, M. (1978). Tetrahedron
Lett. 34:3181.; Gasteiger, J. & Marsili, M. (1980). Tetrahedron
36:3219.; Marsili, M. & Gasteiger, J. (1981). Stud.
Phys. Theor. Chem. 16:56; Marsili, M. & Gasteiger, J. (1981). Croat.
Chem. Acta 53:601.
[32] Horn, A. C. & Clark, T., unpublished.
[33] Gedeck, P., Schindler, T., Alex, A., Clark, T. (2000). J.
Mol. Model., 6:452.
[34] Schleyer, P. v. R. & Kos, A. J. (1983). Tetrahedron
39:1141.
[35] Politzer, P. & Parr, R. G. (1974). J.
Chem. Phys. 61:4258; Politzer, P., Daiker, K. C., Trefonas, P.,
III. (1979). J.
Chem. Phys. 70:4400; Politzer, P. (1980).
Isr. J. Chem. 19:224; Politzer, P. & Sjoeberg, P. (1983). J.
Chem. Phys. 78:7008; Politzer, P. & Levy, M. (1987). J.
Chem. Phys. 87:5044; Erratum (1988). J.
Chem. Phys. 89:2590.
Published in "Molecular Informatics: Confronting Complexity", Martin
G. Hicks & Carsten Kettner (Eds.), Proceedings of the Beilstein-Institut
Workshop, May 13th
- 16th
2002, Bozen, Italy