Chapter 5

Molecular Classification of N-Aryloxazolidinone-5-carboxamides as Human Immunodeficiency Virus Protease Inhibitors

Francisco Torrens1; Gloria Castellano2
1 Institut Universitari de Ciència Molecular, Universitat de València, Edifici d’Instituts de Paterna, València, Spain
2 Departamento de Ciencias Experimentales y Matemáticas, Facultad de Veterinaria y Ciencias Experimentales, Universidad Católica de Valencia San Vicente Mártir, València, Spain

Abstract

Algorithms for classification and taxonomy, based on information entropy (IE) and its production, are proposed in this chapter. The 38 N-aryloxazolidinone-5-carboxamides (NCAs), for human immunodeficiency virus (HIV) protease (PR) inhibition, are classified using seven characteristic chemical properties of different moieties: R1/2, R3–6 on different phenyls and R7. Many classification algorithms are based on IE. When some procedures are applied to moderate-sized sets, an excessive number of results appear compatible with the data, and the procedures suffer a combinatorial explosion. However, the equipartition conjecture (EC) provides a selection criterion among the different variants that result from classification between hierarchical trees. The IE permits classifying NCAs and agrees with principal component analyses. A periodic table (PT) of the properties of PR inhibitors is obtained. The first five features denote the group, while the last two denote the period in the PT. In the PT, NCAs in the same group present similar properties; NCAs that are also in the same period offer maximum resemblance.

Keywords

Periodic law

periodic property

periodic table (PT)

molecular classification

information entropy (IE)

equipartition conjecture (EC)

human immunodeficiency virus (HIV)

acquired immunodeficiency syndrome (AIDS)

protease inhibitor

carboxamide

Acknowledgments

One of the authors, F. T., acknowledges support from the Spanish Ministerio de Economía y Competitividad (Project No. BFU2013-41648-P) and EU ERDF.

1 Introduction

Acquired immunodeficiency syndrome (AIDS) is an end-stage disease that is manifested by the gradual deterioration of the immune competence of infected patients (Alberts et al., 2002). Human immunodeficiency virus type 1 (HIV-1) is the causative organism for AIDS (Volberding and Deeks, 2010); it belongs to the lentiviruses, a group of pathogenic retroviruses, which depend on RNA to encode the genetic message (Chakravarty, 2006). The protease (PR) of HIV-1 is a homodimeric aspartyl PR; it cleaves a 55-kDa polyprotein precursor and, by that process, produces smaller functional protein fragments (e.g., p17, p24, p9, p7), which are responsible for the packing and infectivity of budding virions. The inhibition of PR blocks the processing steps and yields noninfectious, immature progeny virions. Some HIV-1 PR inhibitors (such as indinavir, ritonavir, and nelfinavir) are marketed as anti-HIV-1 drugs, none of which is devoid of adverse effects (Katzung, 2004). Phenotypic cross-resistance has restricted the use of these drugs (Tripathi, 2003). The search for HIV-1 PR inhibitors with better therapeutic efficacy and lower toxicity is in progress.

Ali et al. (2006) described the design, synthesis, and bioevaluation of HIV-1 PR inhibitors, incorporating N-phenyloxazolidinone-5-carboxamides into the (hydroxyethylamino)sulphonamide scaffold as P2 ligands. A series of inhibitors was synthesized with changes at the P2 phenyloxazolidinone and P2’ phenylsulphonamide moieties. The compounds with the (S)-enantiomer of substituted phenyloxazolidinones at P2 showed potent inhibitory activities versus HIV-1 PR. Inhibitors possessing 3/4-acetyl and 3-trifluoromethyl groups on the oxazolidinone phenyl ring were the most potent, with Ki values in the low pM range. The electron-donating groups 4-methoxy and 1,3-dioxolane were preferred at the P2’ phenyl, as compounds with other substitutions showed less binding affinity. Attempts to replace the isobutyl at P1’ with small cyclic moieties caused a loss of affinity. Crystal structure analysis of the two most potent inhibitors, in a complex with HIV-1 PR, provided information on the inhibitor–PR interactions. In the inhibitor–enzyme complexes, the oxazolidinone carbonyl H-bonded with the conserved PR Asp29. Potent inhibitors were selected from each series by incorporating various phenyloxazolidinone-based P2 ligands, and their activities versus a panel of multidrug-resistant (MDR) PR variants were determined. The most potent PR inhibitor started with tight affinity for the wild-type enzyme (Ki = 0.8 pM) and, even versus MDR variants, it retained pM to low nM Ki, which is comparable to the best PR inhibitors approved by the US Food and Drug Administration (FDA). Halder and Jha (2010) applied quantitative structure–activity relationships (QSARs) to some N-aryloxazolidinone-5-carboxamides (NCAs, cf. Figure 5.1) to find structural requirements for more active anti-HIV-1 PR agents. Wang et al. (2011) reported mangiferin as an anti-HIV-1 agent that targets PR and is effective versus resistant strains. Zhang et al. (2012) investigated interactions for HIV drug cross-resistance among PR inhibitors. Liu et al. (2013) reported 4862F, an inhibitor of HIV-1 PR, from a Streptomyces culture.

Figure 5.1 General molecular structure of N-aryloxazolidinone-5-carboxamide.

A simple code is proposed that could be useful for establishing a chemical structure–biosignificance relationship (Benzecri, 1984; Varmuza, 1980). The starting point is to use information entropy (IE) for pattern recognition. The IE is formulated based on the similarity matrix between two biochemical species. As IE is weakly discriminating for classification, the more powerful concepts of IE production and its equipartition conjecture (EC) are introduced (Tondeur and Kvaalen, 1987). Earlier publications analyzed the PTs of local anesthetics (Castellano-Estornell and Torrens-Zaragozá, 2009; Torrens and Castellano, 2006, 2011a), HIV inhibitors (Torrens and Castellano, 2009a, 2010, 2011b, 2012a–d, 2014), anti-cancers (Torrens and Castellano, 2009b, 2013a, 2013b), phenolics (Castellano et al., 2012), flavonoids (Castellano et al., 2013), stilbenoids (Castellano et al., 2014), and constituents of Ganoderma (Castellano and Torrens, 2015). The main aim of this chapter is to develop the code-learning potentialities and, since molecules are more naturally described by structured representations of varying sizes, to study general approaches to structured information processing. A second goal is to present the NCA PT. A third objective is to validate the PT with an external property, anti-HIV-1 PR activity, which is not used in the development of the PT.

2 Computational method

The key problem in classification studies is to define similarity indices when several criteria are involved. The first step in quantifying the similarity concept for NCAs is to list the most important moieties. A vector of properties $\bar{i} = \langle i_1, i_2, \ldots, i_k, \ldots \rangle$ should be associated with every NCA i, whose components correspond to characteristic groups in a hierarchical order according to the importance of their pharmacological potency. If the mth portion of a molecule is more significant for the inhibitory effect than the kth portion, then m < k. The components ik are either 1 or 0, depending on whether an identical portion of rank k is either present or absent in NCA i, compared to a reference. The analysis includes seven regions of structural variation in NCAs: R1–7 positions showing diverse substitution patterns. The structural elements of an NCA are ranked according to their contribution to inhibitory potency as R3 > R6 > R7 > R4 > R5 > R2 > R1. Index i1 = 1 denotes R3 = H (0 otherwise); i2 = 1, R6 = H; i3 = 1, R7 = iPr; i4 = 1, R4 = H; i5 = 1, R5 = OCH3; i6 = 1, R2 = H; and i7 = 1, R1 = Ac. In NCA 5, R3 = R6 = R4 = R2 = H, R7 = iPr, R5 = OCH3, and R1 = Ac; its vector is <1111111>, which was selected as a reference because of its greatest inhibitory activity versus HIV-1. Table 5.1 contains the vectors associated with the 38 NCAs. Vector <1111110> is associated with NCA 1 since R3 = R6 = R4 = R2 = R1 = H, R7 = iPr, and R5 = OCH3.
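The encoding just described can be sketched in a few lines of Python. This is only an illustration: the names `REFERENCE` and `encode`, and the plain-string substituent labels, are assumptions made here and are not taken from the authors' code. The reference substituents are those of NCA 5.

```python
# Reference substituents (those of NCA 5), listed in the ranking order
# R3 > R6 > R7 > R4 > R5 > R2 > R1 that defines components i1..i7.
REFERENCE = [("R3", "H"), ("R6", "H"), ("R7", "iPr"), ("R4", "H"),
             ("R5", "OCH3"), ("R2", "H"), ("R1", "Ac")]


def encode(substituents):
    """Return the binary vector <i1...i7> as a 7-character string.

    `substituents` maps position names ("R1".."R7") to substituent labels.
    A component is 1 when the substituent equals the reference, 0 otherwise.
    """
    return "".join("1" if substituents[position] == reference else "0"
                   for position, reference in REFERENCE)


nca1 = {"R3": "H", "R6": "H", "R7": "iPr", "R4": "H",
        "R5": "OCH3", "R2": "H", "R1": "H"}
nca5 = dict(nca1, R1="Ac")   # identical to NCA 1 except for R1 = Ac

print(encode(nca1))  # 1111110
print(encode(nca5))  # 1111111
```

Storing the vector as a string keeps the hierarchical order of the components explicit, which is convenient when the leading components are later used to define groups.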

Table 5.1

Vector of Properties of NCAs for Molecular Substitutions (R3, R6, R7, R4, R5, R2, R1)

1. –H –H –iPr –H –OCH3 –H –H < 1111110 >

2. –H –H –iPr –H –OCH3 –H –F < 1111110 >

3. –H –H –iPr –H –OCH3 –F –F < 1111100 >

4. –H –H –iPr –H –OCH3 –H –CF3 < 1111110 >

5. –H –H –iPr –H –OCH3 –H –Ac < 1111111 >

6. –H –H –iPr –H –OCH3 –Ac –H < 1111100 >

7. –H –H –iPr –H –OCH3 –H –OCH3 < 1111110 >

8. –H –H –iPr –H –NH2 –H –H < 1111010 >

9. –H –H –iPr –H –NH2 –H –F < 1111010 >

10. –H –H –iPr –H –NH2 –F –F < 1111000 >

11. –H –H –iPr –H –NH2 –H –CF3 < 1111010 >

12. –H –H –iPr –H –NH2 –H –Ac < 1111011 >

13. –H –H –iPr –H –NH2 –Ac –H < 1111000 >

14. –H –H –iPr –O–CH2–O– –H –F < 1110010 >

15. –H –H –iPr –O–CH2–O– –F –F < 1110000 >

16. –H –H –iPr –O–CH2–O– –H –CF3 < 1110010 >

17. –H –H –iPr –O–CH2–O– –H –Ac < 1110011 >

18. –H –H –iPr –O–CH2–O– –Ac –H < 1110000 >

19. –H –H –iPr –F –OCH3 –H –F < 1110110 >

20. –H –H –iPr –F –OCH3 –F –F < 1110100 >

21. –H –H –iPr –F –OCH3 –H –CF3 < 1110110 >

22. –H –H –iPr –F –OCH3 –H –Ac < 1110111 >

23. –H –H –iPr –F –OCH3 –Ac –H < 1110100 >

24. –H –H –iPr –H –OCF3 –H –CF3 < 1111010 >

25. –H –H –iPr –H –OCF3 –H –Ac < 1111011 >

26. –H –H –iPr –OCH3 –H –H –H < 1110010 >

27. –H –H –iPr –OCH3 –H –Ac –H < 1110000 >

28. –H –H –cPr –H –H –H –F < 1101010 >

29. –H –H –cPr –H –OCH3 –F –F < 1101100 >

30. –H –H –cPr –H –OCH3 –Ac –H < 1101100 >

31. –H –H –2-TPa –OCH3 –H –H –H < 1100010 >

32. –H –H –2-TPa –OCH3 –H –H –F < 1100010 >

33. –H –H –2-TPa –OCH3 –H –Ac –H < 1100000 >

34. –F –F –2-TPa –H –F –H –H < 0001010 >

35. –F –F –2-TPa –H –F –H –F < 0001010 >

36. –F –F –2-TPa –H –F –Ac –H < 0001000 >

37. –H –H –2THFb –OCH3 –H –H –H < 1100010 >

38. –H –H –2THFb –OCH3 –H –H –F < 1100010 >

a 2-TP = thiophene

b 2THF = 2-tetrahydrofuran

Denote by rij (0 ≤ rij ≤ 1) the similarity index of two NCAs associated with vectors $\bar{i}$ and $\bar{j}$, respectively. The similitude is characterized by the similarity matrix R = [rij]. The similarity index between two NCAs $\bar{i} = \langle i_1, i_2, \ldots, i_k, \ldots \rangle$ and $\bar{j} = \langle j_1, j_2, \ldots, j_k, \ldots \rangle$ is

$$r_{ij} = \sum_k t_k (a_k)^k \qquad (k = 1, 2, \ldots), \tag{5.1}$$

where 0 ≤ ak ≤ 1, and tk = 1 if ik = jk, but tk = 0 if ik ≠ jk. The definition assigns a weight $(a_k)^k$ to any property involved in the description of molecule i or j.
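A minimal Python sketch of Eq. (5.1) follows, assuming that the weight of the kth property is ak raised to the power k, as stated above. The function name `similarity` and the string representation of the vectors are illustrative choices, not part of MolClas.

```python
def similarity(vec_i, vec_j, weights):
    """r_ij = sum_k t_k * (a_k)**k, with t_k = 1 where the components agree."""
    return sum(a ** k
               for k, (x, y, a) in enumerate(zip(vec_i, vec_j, weights), start=1)
               if x == y)


equal_weights = [0.5] * 7

r_same = similarity("1111111", "1111111", equal_weights)   # all seven terms kept
r_last = similarity("1111111", "1111110", equal_weights)   # mismatch in i7 only
r_first = similarity("1111111", "0111111", equal_weights)  # mismatch in i1 only
print(r_same, r_last, r_first)
```

With equal weights ak = 0.5, a mismatch in the first component removes 0.5 from the sum, whereas a mismatch in the seventh removes only 0.5^7, so the hierarchical ranking of the properties is built into the index.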

3 Classification algorithm

The grouping algorithm uses the stabilized similarity matrix obtained by applying the max–min composition rule ∘, defined by

$$(R \circ S)_{ij} = \max_k \left[ \min \left( r_{ik}, s_{kj} \right) \right], \tag{5.2}$$

where R = [rij] and S = [sij] are matrices of the same type, and (R ∘ S)ij is the (i,j)th element of the matrix R ∘ S (Cox, 1994; Kaufmann, 1975; Kundu, 1998; Lambert-Torres et al., 1999). When the max–min composition rule is applied iteratively so that R(n + 1) = R(n) ∘ R, an integer n exists such that R(n) = R(n + 1) = … The matrix R(n) is called the stabilized similarity matrix. The importance of stabilization lies in the fact that, in classification, it generates a partition into disjoint classes. The stabilized matrix is designated by R(n) = [rij(n)]. The grouping rule is as follows: i and j are assigned to the same class if rij(n) ≥ b. The class of i, denoted $\hat{i}$, is the set of species j that satisfy the rule rij(n) ≥ b. The matrix of classes is

$$\hat{R}(n) = \left[ \hat{r}_{\hat{i}\hat{j}} \right] = \max_{s,t} \left( r_{st} \right) \qquad (s \in \hat{i},\ t \in \hat{j}), \tag{5.3}$$

where s stands for any index of the species belonging to class $\hat{i}$ (similarly for t and $\hat{j}$). Eq. (5.3) means finding the largest similarity index between species of two different classes.
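The stabilization and grouping steps can be sketched in Python as follows. The vectors are transcribed from Table 5.1 in entry order; the function names, the equal weights ak = 0.5, and the convention that a species always belongs to its own class are assumptions of this sketch rather than documented features of MolClas.

```python
# Vectors <i1...i7> of NCAs 1-38, transcribed from Table 5.1.
VECTORS = ("1111110 1111110 1111100 1111110 1111111 1111100 1111110 1111010 "
           "1111010 1111000 1111010 1111011 1111000 1110010 1110000 1110010 "
           "1110011 1110000 1110110 1110100 1110110 1110111 1110100 1111010 "
           "1111011 1110010 1110000 1101010 1101100 1101100 1100010 1100010 "
           "1100000 0001010 0001010 0001000 1100010 1100010").split()


def similarity(u, v, a=0.5):
    """Eq. (5.1) with equal weights a_k = a."""
    return sum(a ** k for k, (x, y) in enumerate(zip(u, v), start=1) if x == y)


def max_min(r, s):
    """Max-min composition, Eq. (5.2)."""
    n = len(r)
    return [[max(min(r[i][k], s[k][j]) for k in range(n)) for j in range(n)]
            for i in range(n)]


def stabilize(r):
    """Iterate R(n + 1) = R(n) o R until the matrix no longer changes."""
    current = r
    while True:
        following = max_min(current, r)
        if following == current:
            return current
        current = following


def classes(stable, b):
    """Put i and j in the same class when r_ij(n) >= b (1-based numbering)."""
    assigned, result = set(), []
    for i in range(len(stable)):
        if i not in assigned:
            members = [j for j in range(len(stable))
                       if j == i or stable[i][j] >= b]
            assigned.update(members)
            result.append([j + 1 for j in members])
    return result


R = [[similarity(u, v) for v in VECTORS] for u in VECTORS]
stable = stabilize(R)
for level in (0.99, 0.98, 0.97, 0.94):
    print(level, len(classes(stable, level)))   # 18, 14, 8 and 5 classes
```

Because the composition only takes maxima and minima of existing entries, the exact equality test in `stabilize` is safe. Under these assumptions the class counts at the four levels shown coincide with those listed for the same levels in the chapter; agreement at other levels is not guaranteed, since the actual weights and implementation details of MolClas are not given here.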

4 Information entropy

In information theory, the IE h measures the surprise that a source emitting sequences can give (Shannon, 1948a, 1948b). Consider the use of a qualitative spot test to determine the presence of Fe in a sample of water. Without any history of testing, an analyst must begin by assuming that the two outcomes 0/1 (Fe absent or present) are equiprobable, with probabilities 1/2. When up to two metals may be present in the sample (e.g., Fe and Ni), four possible outcomes exist, ranging from neither being present (0,0) to both (1,1), with probabilities 1/2^2. Which of the four possibilities turns up is determined by using two tests, each with two observable states. With three elements, eight possibilities exist with probabilities 1/2^3; three tests are needed. The following pattern relates the uncertainty to the information needed to resolve it. The number of possibilities is expressed as a power of 2. The power to which 2 must be raised to give the number of possibilities N is defined as the logarithm to base 2 of that number. Information and uncertainty are defined by the logarithm to base 2 of the number of possible analytical outcomes: log2 N.
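The counting argument above can be checked with a short Python sketch. The function name `shannon_entropy` is an illustrative choice; the function evaluates the standard Shannon formula in bits, which reduces to log2 N for N equiprobable outcomes and also covers outcomes of unequal probability.

```python
import math


def shannon_entropy(probabilities):
    """H = -sum p_i log2 p_i in bits; outcomes with p_i = 0 contribute nothing."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


for n_metals in (1, 2, 3):
    n_outcomes = 2 ** n_metals
    bits = shannon_entropy([1 / n_outcomes] * n_outcomes)
    # For equiprobable outcomes the entropy equals the number of yes/no tests.
    print(n_metals, n_outcomes, bits)
```

For one, two, and three metals the entropy is 1, 2, and 3 bits, that is, the number of two-state tests needed to resolve the uncertainty.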

The initial uncertainty is defined in terms of the probability of the occurrence of every outcome; e.g., for the above-mentioned probabilities, the following definition results: I = H = log2 N = log2 1/p = –log2 p, where I is the information contained in an answer given that there were N possibilities, with H as the initial uncertainty resulting from the need to consider N possibilities and p, the probability of every outcome if all N possibilities are equally likely to occur. The expression is generalized to a situation in which the probability of every outcome is not the same. If one knows that some elements are more likely to be present than others, the equation is adjusted so that the logarithms of individual probabilities suitably weighted are summed: H = –Σ pi log2 pi, where Σ pi = 1. Consider the original example, except that in this case, past experience showed that 90% of the samples contained no Fe. The degree of uncertainty is calculated by: H = –(0.9 log2 0.9 + 0.1 log2 0.1) = 0.469 bits. For a single event occurring with probability p, the degree of surprise is proportional to –ln p. By generalizing the result to a random variable X (which can take N possible values x1, …, xN with probabilities p1, …, pN), the average surprise received on learning the X value is –Σ pi ln pi. The IE associated with R similarity matrix is

$$h(R) = -\sum_{i,j} r_{ij} \ln r_{ij} - \sum_{i,j} \left( 1 - r_{ij} \right) \ln \left( 1 - r_{ij} \right). \tag{5.4}$$

Denote by Cb the set of classes and by $\hat{R}_b$ the similarity matrix at grouping level b. The IE satisfies several properties: (i) h(R) = 0 if either rij = 0 or rij = 1; (ii) h(R) is at its maximum if rij = 0.5 (i.e., when the imprecision is at its maximum); (iii) $h(\hat{R}_b) \le h(R)$ for any b (i.e., classification leads to IE loss); (iv) $h(\hat{R}_{b_1}) \le h(\hat{R}_{b_2})$ if b1 < b2 (i.e., IE is a monotone function of the grouping level b).
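Properties (i) and (ii) can be verified numerically with a small Python sketch of Eq. (5.4). The function names are illustrative, and the limit r ln r → 0 is assumed at r = 0 and r = 1.

```python
import math


def _element_entropy(r):
    """Contribution of one matrix element to Eq. (5.4)."""
    if r <= 0.0 or r >= 1.0:
        return 0.0   # limit of r ln r as r -> 0 (and of (1 - r) ln (1 - r) as r -> 1)
    return -r * math.log(r) - (1.0 - r) * math.log(1.0 - r)


def information_entropy(matrix):
    """h(R): sum of the element contributions over all i, j."""
    return sum(_element_entropy(r) for row in matrix for r in row)


crisp = [[1.0, 0.0], [0.0, 1.0]]   # only 0/1 entries: no imprecision
fuzzy = [[0.5, 0.5], [0.5, 0.5]]   # maximum imprecision in every entry
print(information_entropy(crisp), information_entropy(fuzzy))
```

A matrix of zeros and ones gives h = 0, while every entry equal to 0.5 contributes ln 2, the largest possible value per element.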

5 The EC of entropy production

In the classification algorithm, every hierarchical tree corresponds to a dependence of IE on the grouping level, and an h–b diagram is obtained. Tondeur and Kvaalen (1987) proposed the EC of IE production as the selection criterion among the different variants that result from classification among hierarchical trees. According to EC, for a given charge, the binary tree (BT) with the best configuration is the one in which IE production is the most uniformly distributed. One proceeds by analogy, using IE instead of thermodynamic entropy. The EC implies linear dependence (i.e., constant IE production) along scale b, so that the EC line is

$$h_{\mathrm{eqp}} = h_{\max}\, b. \tag{5.5}$$

Since the classification is discrete, the way of expressing EC would be a regular staircase function. The best variant is chosen to be the one that minimizes the sum of the squares of the deviations:

$$SS = \sum_{b_i} \left( h - h_{\mathrm{eqp}} \right)^2. \tag{5.6}$$
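Eqs. (5.5) and (5.6) translate directly into code. The Python sketch below evaluates SS for one set of (b, h) pairs, namely the grouping levels and entropies that the chapter tabulates for the NCAs in Table 5.2. Taking h_max as the largest tabulated entropy is an assumption of this sketch, and the function names are illustrative.

```python
# (grouping level b, entropy h) pairs as tabulated for the NCAs (Table 5.2).
LEVELS = [(1.00, 444.62), (0.99, 100.94), (0.98, 63.35), (0.97, 21.18),
          (0.94, 7.98), (0.91, 4.33), (0.88, 2.75), (0.75, 1.20), (0.07, 0.05)]


def equipartition_line(b, h_max):
    """Eq. (5.5): entropy expected at level b under constant IE production."""
    return h_max * b


def sum_of_squares(levels, h_max):
    """Eq. (5.6): squared deviations of h from the equipartition line."""
    return sum((h - equipartition_line(b, h_max)) ** 2 for b, h in levels)


h_max = max(h for _, h in LEVELS)   # assumption: largest tabulated entropy
print(sum_of_squares(LEVELS, h_max))
```

To select among competing hierarchical trees, the same function would be evaluated on the h–b data of each variant and the variant with the smallest SS retained.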

6 Learning procedure

Learning procedures (LPs), similar to those encountered in stochastic methods, are implemented (White, 1989). Consider a partition into classes that is taken as good from observations; it corresponds to a reference similarity matrix S = [sij], obtained for equal weights a1 = a2 = … = a and for an arbitrary number of fictitious properties. Consider also the same set of species as in the good classification, together with the actual properties. The degree of similarity rij is computed by Eq. (5.1), giving matrix R. The number of properties for R and S differs. The LP consists in finding classification results for R that are as close as possible to the good classification. The a1 weight is taken to be constant, and the following weights a2, a3, … are subjected to random variations. A new similarity matrix is obtained via Eq. (5.1) and the new weights. The distance between the partitions into classes, characterized by R and S, is

$$D = \sum_{ij} \left( 1 - r_{ij} \right) \ln \frac{1 - r_{ij}}{1 - s_{ij}} + \sum_{ij} r_{ij} \ln \frac{r_{ij}}{s_{ij}} \qquad \forall\; 0 \le r_{ij}, s_{ij} \le 1. \tag{5.7}$$

The definition was suggested by that introduced in information theory by Kullback (1959), in order to measure the distance between two probability distributions. It is a measure of the distance between the R and S matrices. Since a corresponding classification exists for every matrix, two classifications will be compared by the distance, which is a nonnegative quantity that approaches zero as the resemblance between R and S rises. The result of the algorithm is a set of weights allowing classification. The procedure was applied to the synthesis of complex BTs using IE (Baez and Dolan, 1995; Crans, 2000; Iordache, 2011, 2012; Iordache et al., 1993; Leinster, 2004).
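A Kullback-type distance between two similarity matrices can be sketched in Python as below. The sign convention is chosen here so that D is nonnegative and vanishes when R = S, as the text requires; the function names are illustrative, and the entries of S are assumed to lie strictly between 0 and 1 so that the logarithms are defined.

```python
import math


def _kl_term(p, q):
    """p ln(p / q), with the convention 0 ln 0 = 0."""
    return 0.0 if p == 0.0 else p * math.log(p / q)


def distance(r, s):
    """Sum over all elements of the binary Kullback divergence between r_ij and s_ij."""
    total = 0.0
    for row_r, row_s in zip(r, s):
        for rij, sij in zip(row_r, row_s):
            total += _kl_term(rij, sij) + _kl_term(1.0 - rij, 1.0 - sij)
    return total


R = [[0.9, 0.5], [0.5, 0.9]]   # made-up matrices, for illustration only
S = [[0.8, 0.4], [0.4, 0.8]]
print(distance(R, R), distance(R, S))
```

In a learning loop, the weights a2, a3, … would be perturbed at random, R recomputed, and a perturbation kept whenever it lowers the distance to the reference matrix S.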

Our code MolClas is a simple, reliable, efficient, and fast procedure for molecular classification, based on the EC of IE production, according to Eqs. (5.1)–(5.7). With IE, but without EC, an excessive number of results appear compatible with the data, and the procedure suffers a combinatorial explosion; however, with EC, the best configuration is that in which IE production is most uniformly distributed. MolClas reads the number of properties and the molecular properties; it allows the optimization of the coefficients; it optionally reads the starting coefficients and the number of iteration cycles. The correlation matrix can be either calculated by MolClas or read from the input file. MolClas allows the transformation of the correlation matrix from the range [–1,1] to [0,1]; it calculates the similarity matrix of the property in symmetric storage mode; it applies the graphical correlation model to obtain the partial correlation diagram (PCD); it computes the classifications, tests if the groupings are different, calculates the distances between classifications, computes the similarity matrices of groupings, works out classifications of IE, optimizes the coefficients, performs single/complete-linkage hierarchical cluster analyses, and plots cluster diagrams. It was written not only to analyze the EC of IE production, but also to explore the world of molecular classification.

7 Calculation results and discussion

Structural data of anti-HIV-1 NCAs reported by Halder and Jha (2010) were used as the model data set. The matrix of Pearson correlation coefficients (PCCs) was calculated between the pairs of vectors <i1,i2,i3,i4,i5,i6,i7> of the 38 NCAs. The PCCs are illustrated in a PCD, which could contain high (r ≥ 0.75), medium (0.50 ≤ r < 0.75), low (0.25 ≤ r < 0.50), and no (r < 0.25) partial correlations. Pairs of inhibitors with high partial correlations show similar vectors; however, the results should be taken with care because the NCA with the constant vector <1111111> (Entry 5) shows a null standard deviation, causing the greatest partial correlations r = 1 with any NCA, which is an artifact. With EC, correlations are illustrated in a PCD, which contains 598 high (cf. Figure 5.2, red lines) and 105 zero (black) partial correlations. Notice that 3 out of the 37 high partial correlations of Entry 5 were corrected: its correlations with Entries 34–36 are zero partial correlations.

Figure 5.2 Partial correlation diagram: High (red) intercorrelations of NCAs.

The grouping rule in the case of equal weights ak = 0.5 for b1 = 0.97 allows the following classes:

$C_{b_1}$ = (1–7)(8–13,24,25)(14–18,26,27)(19–23)(28)(29,30)(31–33,37,38)(34–36).

Eight classes are obtained, with associated IE $h(\hat{R}_{b_1})$ = 21.18. The BT matching <i1,i2,i3,i4,i5,i6,i7> and $C_{b_1}$ (cf. Figure 5.3) is calculated (IMSL, 1989; Jarvis and Patrick, 1973; Tryon, 1939); it provides a BT of Table 5.1 that separates the same classes: the data bifurcate into classes 5, 1–4, and 6–8 with 1, 7, 8, 7, 5, 2, 5, and 3 NCAs, respectively (Page, 2000). NCAs 1–7, with the greatest inhibitory activity, are grouped into the same class. The NCAs in the same cluster appear highly correlated in the PCD (Figure 5.2).

Figure 5.3 N-aryloxazolidinone-5-carboxamides dendrogram with anti-HIV activity: level b1.

At level b2 with 0.93 ≤ b2 ≤ 0.94, the set of classes turns out to be:

$C_{b_2}$ = (1–13,24,25)(14–23,26,27)(28–30)(31–33,37,38)(34–36).

Five classes result, and IE decays to $h(\hat{R}_{b_2})$ = 7.98. The BT matching <i1,i2,i3,i4,i5,i6,i7> and $C_{b_2}$ (cf. Figure 5.4) divides the same five classes: 1–5 with 15, 12, 3, 5, and 3 NCAs, respectively. Again, NCAs with the greatest inhibitory activity are grouped into the same class. The NCAs belonging to the same cluster appear highly correlated in the PCD, in qualitative agreement with the BT (Figures 5.2 and 5.3).

Figure 5.4 N-aryloxazolidinone-5-carboxamides dendrogram with anti-HIV activity: level b2.

Table 5.2 shows an analysis of the set containing 1–38 classes, agreeing with PCD and BTs (Figures 5.2–5.4).

Table 5.2

Classification Level, Number of Classes and Entropy for the Vector of Properties of NCAs

Classification Level b    Number of Classes    Entropy h
1.00                      38                   444.62
0.99                      18                   100.94
0.98                      14                    63.35
0.97                       8                    21.18
0.94                       5                     7.98
0.91                       4                     4.33
0.88                       3                     2.75
0.75                       2                     1.20
0.07                       1                     0.05

In view of PCD and BTs (Figures 5.2–5.4), the data are split into the same classes as above. Figure 5.5 displays the BT. Again, NCAs with the greatest inhibitory activity are grouped into the same class.

Figure 5.5 Dendrogram of N-aryloxazolidinone-5-carboxamides with anti-HIV inhibitory activity.

The illustration of the classification in a radial tree (RT; cf. Figure 5.6) shows the same groupings as already discussed, in qualitative agreement with the PCD and BTs (Figures 5.2–5.5). Again, NCAs with the greatest inhibitory activity (1–13, etc.) are grouped into the same cluster.

Figure 5.6 RT of N-aryloxazolidinone-5-carboxamides with anti-HIV inhibitory activity.

The SplitsTree program allows analyzing the cluster analysis (CA) data (Huson, 1998). Based on the method of split decomposition, it takes the distance matrix as input and produces a graph that represents the relationships between the taxa. For ideal data, the graph is an RT, whereas less ideal data will give rise to an RT-like net, which is interpreted as possible evidence for conflicting data. Furthermore, as split decomposition does not attempt to force the data onto an RT, it can provide a good indication of how RT-like the given data are. The splits graph (SG) for the 38 NCAs in Table 5.1 (cf. Figure 5.7) shows that 1–27 collapse, as well as 28–33, 37, and 38, and 34–36. It reveals no conflicting relationship between classes. It is in qualitative agreement with PCD, BTs, and RT (Figures 5.2–5.6).

Figure 5.7 SG of N-aryloxazolidinone-5-carboxamides with anti-HIV inhibitory activity.

Usually, in structure–property relationships (SPRs), the data file contains fewer than 100 objects and more than 1000 X-variables. So many X-variables exist that no one can discover the patterns in the objects by inspection. Principal components analysis (PCA) is useful to summarize the information contained in an X-matrix and put it in an understandable form (Hotelling, 1933; Jolliffe, 2002; Kramer, 1998; Patra et al., 1999; Shaw, 2003; Xu and Hagler, 2002). The PCA works by decomposing the X-matrix as the product of two smaller matrices P and T. The loading matrix (P), with information about the variables, contains a few vectors, the principal components (PCs), which are obtained as linear combinations (LCs) of the original X-variables. The score matrix (T), with information about the objects, is such that every object is described in terms of its projections onto the PCs instead of the original variables: X = TP′ + E, where the prime denotes the transpose matrix.

The information not contained in the matrices remains as the unexplained X-variance in the residual matrix (E). Every PCi is a new coordinate expressed as an LC of the old features xj: PCi = Σjbijxj. The new coordinates PCi are called scores or factors, while the coefficients bij are called loadings. Scores are ordered according to their information content with regard to the total variance among all objects. Score–score plots (SPs) show the positions of compounds in the new coordinate system, while loading–loading plots (LPs) show the locations of the features that represent compounds in the new coordinates. The PCs show two properties: (i) they are extracted in decaying order of importance, so that the first PC F1 always contains more information than F2, F2 more than F3, etc.; and (ii) the PCs are mutually orthogonal, so that no correlation exists between the information contained in different PCs. A PCA was performed for the NCAs. The importance of PCA factors F1–F7 for {i1,i2,i3,i4,i5,i6,i7} is collected in Table 5.3. Factors are LCs of the vectors. Factor F1 explains 36% of the variance (64% error); F1 and F2 together, 55% of the variance (45% error); F1–3, 72% of the variance (28% error); etc.
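As an illustration of how a leading factor and its share of variance can be obtained, the Python sketch below applies power iteration to the correlation matrix of the seven properties over the 38 NCAs. This is not the authors' procedure: the use of the correlation matrix, the power-iteration method, and all function names are assumptions of this sketch. The vectors are transcribed from Table 5.1.

```python
import math

# Vectors <i1...i7> of NCAs 1-38, transcribed from Table 5.1.
VECTORS = ("1111110 1111110 1111100 1111110 1111111 1111100 1111110 1111010 "
           "1111010 1111000 1111010 1111011 1111000 1110010 1110000 1110010 "
           "1110011 1110000 1110110 1110100 1110110 1110111 1110100 1111010 "
           "1111011 1110010 1110000 1101010 1101100 1101100 1100010 1100010 "
           "1100000 0001010 0001010 0001000 1100010 1100010").split()


def standardize(column):
    """Centre a column and scale it to unit standard deviation."""
    mean = sum(column) / len(column)
    sd = math.sqrt(sum((x - mean) ** 2 for x in column) / len(column))
    return [(x - mean) / sd for x in column]


def correlation_matrix(columns):
    z = [standardize(col) for col in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def leading_component(matrix, iterations=500):
    """Power iteration: (eigenvalue, unit eigenvector) of the dominant factor."""
    v = [1.0] * len(matrix)
    for _ in range(iterations):
        w = [sum(m * x for m, x in zip(row, v)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(vi * sum(m * x for m, x in zip(row, v))
                     for vi, row in zip(v, matrix))
    return eigenvalue, v


columns = [[int(vec[k]) for vec in VECTORS] for k in range(7)]
corr = correlation_matrix(columns)
eigenvalue, loadings = leading_component(corr)
# Share of the total variance (trace = 7) carried by the leading factor, in %.
print(eigenvalue, 100.0 * eigenvalue / len(corr))
```

Because i1 and i2 take identical values in all 38 NCAs, their correlation is exactly 1 and one eigenvalue of the matrix must vanish, which is consistent with the null eigenvalue reported for F7. A full PCA would extract the remaining factors as well, for example with a symmetric eigenvalue solver.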

Table 5.3

Importance of PCA Factors for the Vectors of Properties of NCAs

Factor    Eigenvalue    Percentage of Variance    Cumulative Percentage of Variance
F1        2.52038704    36.01                      36.01
F2        1.32473387    18.92                      54.93
F3        1.22151048    17.45                      72.38
F4        0.70630076    10.09                      82.47
F5        0.67263324     9.61                      92.08
F6        0.55443462     7.92                     100.00
F7        0.00000000     0.00                     100.00

The PCA factor loadings are shown in Table 5.4.

Table 5.4

PCA Loadings for the Vectors of Properties of NCAs

PCA Factor Loadings(a)

Prprt.   F1            F2            F3            F4            F5            F6            F7
i1        0.59637237   −0.10094594   −0.11589113    0.19084837    0.10208832    0.27179778    0.70710678
i2        0.59637237   −0.10094594   −0.11589113    0.19084837    0.10208832    0.27179778   −0.70710678
i3        0.42229981    0.18796773    0.25838970   −0.21712853    0.36848370   −0.73255766    0.00000000
i4       −0.17914678    0.23137929    0.66442571    0.34714693    0.49054502    0.33430991    0.00000000
i5        0.22944642   −0.12897077    0.63241885    0.07483329   −0.72193407   −0.06307494    0.00000000
i6        0.00985543    0.66131843   −0.25251206    0.62028918   −0.25230381   −0.22446065    0.00000000
i7        0.15970753    0.66089036    0.03458013   −0.60756837   −0.13377322    0.38663556    0.00000000

(a) Loadings greater than 0.7 are in bold.

The PCA F1–7 profile for the vectors is listed in Table 5.5. For factors F1 and F7, variables {i1,i2} show the greatest weight in the profile; however, F1 cannot be reduced to both variables without a 29% error, whereas F7 is reduced to both variables with a 0% error. For F2, the variable i6 shows the greatest weight; nevertheless, F2 cannot be reduced to the two variables {i6,i7} without a 13% error. For F3, the variable i4 shows the greatest weight; nevertheless, F3 cannot be reduced to the two variables {i4,i5} without a 16% error. For F4, the variable i6 shows the greatest weight; however, F4 cannot be reduced to the two variables {i6,i7} without a 25% error. For F5, the variable i5 shows the greatest weight; however, F5 cannot be reduced to the two variables {i4,i5} without a 24% error. For F6, the variable i3 shows the greatest weight; nevertheless, F6 cannot be reduced to the two variables {i3,i7} without a 31% error.

Table 5.5

Profile of the PCA Factors for the Vectors of the Properties of NCAs

Factor    % of i1    % of i2    % of i3    % of i4    % of i5    % of i6    % of i7
F1        35.57      35.57      17.83       3.21       5.26       0.01       2.55
F2         1.02       1.02       3.53       5.35       1.66      43.73      43.68
F3         1.34       1.34       6.68      44.15      40.00       6.38       0.12
F4         3.64       3.64       4.71      12.05       0.56      38.48      36.91
F5         1.04       1.04      13.58      24.06      52.12       6.37       1.79
F6         7.39       7.39      53.66      11.18       0.40       5.04      14.95
F7        50.00      50.00       0.00       0.00       0.00       0.00       0.00

Note: Percentages greater than 50% are in bold.
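The tabulated profile percentages appear to be consistent with 100 times the squared loadings of Table 5.4. The Python sketch below checks this for factor F1; the relation itself is an observation made here from the two tables, not a statement from the original text, and the names are illustrative.

```python
# F1 column of Table 5.4: loadings of i1..i7 on the first factor.
F1_LOADINGS = [0.59637237, 0.59637237, 0.42229981, -0.17914678,
               0.22944642, 0.00985543, 0.15970753]


def profile(loadings):
    """Percentage contribution of each variable: 100 * squared loading."""
    return [100.0 * w * w for w in loadings]


f1_profile = profile(F1_LOADINGS)
# Error made if F1 is reduced to the two variables {i1, i2}.
residual = 100.0 - f1_profile[0] - f1_profile[1]
print([round(p, 2) for p in f1_profile], round(residual))
```

Under this reading, the error quoted for reducing a factor to two variables is simply the percentage carried by the remaining five.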

In the PCA F2–F1 SP, NCAs with the same vector collapse. Five NCA classes are distinguished: (i) class 1, with 15 compounds (0 < F1 < F2, cf. Figure 5.8, top); (ii) class 2, with 12 compounds (F1 > F2 ≈ 0, right); (iii) class 3, with 3 compounds (F1 < F2 ≈ 0, center); (iv) class 4, with 5 compounds (0 > F1 > F2, bottom); and (v) class 5, with 3 compounds (F1 << F2, left). The classification agrees with PCD, BTs, RT, and SG (Figures 5.2–5.7).

Figure 5.8 PCA F2 versus F1 scores plot for anti-HIV NCAs.

From the PCA factor loadings of NCAs (Table 5.4), the F2–F1 LP (cf. Figure 5.9) depicts seven vectors, and properties R3 and R6 collapse. As a complement to the SP (Figure 5.8), the loadings (Figure 5.9) show that NCAs in class 1, located at the top, present a contribution of R1 = Ac, which is situated at the same position. The NCAs in class 2, on the right, have greater contributions of R7 = iPr. The NCAs in classes 3 and 4, in the middle and bottom, present a contribution of R3 = R6 = H. The NCAs in class 5, on the left, present a contribution of R4 = H. Two classes of properties are distinguished in the LP: class 1 {R3,R6,R7,R4,R5,R2} (F1 > F2, Figure 5.9, bottom) and class 2 {R1} (F1 < F2, top).

Figure 5.9 PCA F2 versus F1 loadings plot for anti-HIV NCAs.

Instead of 38 NCAs in the space ℜ7 of seven vectors, consider seven properties in the space ℜ38 of 38 NCAs. The BT of the vectors (cf. Figure 5.10) separates the first property R1 (class 2), then R2, R5, R4, R7, R3, and R6 (class 1), agreeing with PCA LP (Figure 5.9).

Figure 5.10 Dendrogram for the vectors of properties corresponding to anti-HIV NCAs.

The RT for the vectors (cf. Figure 5.11) separates the same classes as given previously, agreeing with PCA LP and BT (Figures 5.9 and 5.10).

Figure 5.11 RT for the vectors of properties corresponding to anti-HIV NCAs.

The SG for the vectors (cf. Figure 5.12) shows that R3–7 collapse. No conflicting relationship appears between the classes. The SG agrees with PCA LP, BT, and RT (Figures 5.9–5.11).

Figure 5.12 SG for the vectors of properties corresponding to anti-HIV NCAs.

Another PCA was performed for the vectors, which are described by their occurrence in molecules, so that the new factors are LCs of the compounds. Factor F1 explains 41% of the variance (59% error); F1 and F2 together, 58% of the variance (42% error); F1–3, 74% of the variance (26% error); etc. In the PCA F2–F1 SP, R3 and R6 collapse. Two classes of properties are distinguished: class 1 {R3,R6,R7,R4,R5,R2} (F1 > F2, cf. Figure 5.13, right) and class 2 {R1} (F1 < F2, left), agreeing with PCA LP, BT, RT, and SG (Figures 5.9–5.12).

Figure 5.13 PCA F2 versus F1 scores plot for vectors of properties of NCAs.

The recommended format for the NCA PT (cf. Table 5.6) shows that the NCAs are classified first by i1, then by i2, i3, i4, i5, i6, and, finally, by i7. Periods of eight units are assumed; e.g., group g00010 stands for <i1,i2,i3,i4,i5> = <00010>: <0001000> (–F –F –2-TP –H –F –Ac –H), etc. The NCAs in the same group appear close in PCD, BTs, RT, SG, and PCA SP (Figures 5.2–5.8).
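The assignment of an NCA to a group and a period follows directly from its vector: the first five components give the group and the last two give the period. A minimal Python sketch, with an illustrative function name and a handful of vectors taken from Table 5.1, is:

```python
def group_and_period(vector):
    """Split <i1...i7> into a group label (i1-i5) and a period label (i6, i7)."""
    return "g" + vector[:5], "p" + vector[5:]


# A few entries from Table 5.1 (NCA number -> vector).
examples = {5: "1111111", 1: "1111110", 28: "1101010", 36: "0001000"}

for number, vector in examples.items():
    print(number, *group_and_period(vector))
```

NCAs 1 and 5 fall in the same group (g11111) but in different periods (p10 and p11), so they are expected to be similar without showing maximum resemblance.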

Table 5.6

Periodic Properties for N-aryloxazolidinone-5-carboxamide Derivatives

Period    Group     Substituents (R3, R6, R7, R4, R5, R2, R1)
p00       g00010    F F 2-TP H F Ac H
p00       g11000    H H 2-TP OCH3 H Ac H
p00       g11011    H H cPr H OCH3 F F
p00       g11011    H H cPr H OCH3 Ac H
p00       g11100    H H iPr –O–CH2–O– F F
p00       g11100    H H iPr –O–CH2–O– Ac H
p00       g11100    H H iPr OCH3 H Ac H
p00       g11101    H H iPr F OCH3 F F
p00       g11101    H H iPr F OCH3 Ac H
p00       g11110    H H iPr H NH2 F F
p00       g11110    H H iPr H NH2 Ac H
p00       g11111    H H iPr H OCH3 F F
p00       g11111    H H iPr H OCH3 Ac H
p10       g00010    F F 2-TP H F H H
p10       g00010    F F 2-TP H F H F
p10       g11000    H H 2-TP OCH3 H H H
p10       g11000    H H 2-TP OCH3 H H F
p10       g11000    H H 2THF OCH3 H H H
p10       g11000    H H 2THF OCH3 H H F
p10       g11010    H H cPr H H H F
p10       g11100    H H iPr –O–CH2–O– H F
p10       g11100    H H iPr –O–CH2–O– H CF3
p10       g11100    H H iPr OCH3 H H H
p10       g11101    H H iPr F OCH3 H F
p10       g11101    H H iPr F OCH3 H CF3
p10       g11110    H H iPr H NH2 H H
p10       g11110    H H iPr H NH2 H F
p10       g11110    H H iPr H NH2 H CF3
p10       g11110    H H iPr H OCF3 H CF3
p10       g11111    H H iPr H OCH3 H H
p10       g11111    H H iPr H OCH3 H F
p10       g11111    H H iPr H OCH3 H CF3
p10       g11111    H H iPr H OCH3 H OCH3
p11       g11100    H H iPr –O–CH2–O– H Ac
p11       g11101    H H iPr F OCH3 H Ac
p11       g11110    H H iPr H NH2 H Ac
p11       g11110    H H iPr H OCF3 H Ac
p11       g11111    H H iPr H OCH3 H Ac

The variation of property P (HIV-1 PR inhibitory activity) of vector <i1,i2,i3,i4,i5,i6,i7> (cf. Figure 5.14) is expressed in the decimal system, P = 10^6 i1 + 10^5 i2 + 10^4 i3 + 10^3 i4 + 10^2 i5 + 10 i6 + i7, versus structural parameters {i1,i2,i3,i4,i5,i6,i7} for NCAs. Most data collapse. Parameter i1 shows greater variation than i2, i3, i4, i5, and i6. The property P was not used in the development of the PT and serves to validate it. The results agree with a PT of properties with vertical groups defined by {i1,i2,i3,i4,i5} and horizontal periods defined by {i6,i7}.
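The decimal expression for P amounts to reading the binary vector as a base-10 number. A minimal Python sketch, with an illustrative function name, is:

```python
def decimal_property(vector):
    """P = 10^6 i1 + 10^5 i2 + ... + 10 i6 + i7 for a 7-component vector."""
    return sum(int(component) * 10 ** (6 - k)
               for k, component in enumerate(vector))


for vector in ("1111111", "1111110", "0001000"):
    print(vector, decimal_property(vector))
```

Because i1 carries the factor 10^6, a change in the highest-ranked property dominates the value of P, while changes in i6 and i7 only modify the last two digits.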

Figure 5.14 Variation of property P(p) of NCAs versus counts {i1,i2,i3,i4,i5,i6,i7}.
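The decimal expression of the vector can be illustrated with a short sketch. The function name `decimal_code` is hypothetical, and the code only evaluates the positional sum 10^6·i1 + 10^5·i2 + ... + 10·i6 + i7; it makes no claim about how the chapter's own program computes it.

```python
def decimal_code(vector):
    """Return 10^6*i1 + 10^5*i2 + 10^4*i3 + 10^3*i4 + 10^2*i5 + 10*i6 + i7."""
    if len(vector) != 7:
        raise ValueError("expected seven components <i1..i7>")
    # Component k (0-based) carries the weight 10**(6 - k).
    return sum(i * 10 ** (6 - k) for k, i in enumerate(vector))


print(decimal_code([0, 0, 0, 1, 0, 0, 0]))  # 1000
print(decimal_code([1, 1, 1, 1, 1, 1, 1]))  # 1111111
```

Because i1 carries the largest weight, a change in i1 dominates the value of the sum.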

The change of property P of vector <i1,i2,i3,i4,i5,i6,i7> in base 10 versus the number of the group in the NCA PT (cf. Figure 5.15) reveals minima and maxima corresponding to compounds with <i1,i2,i3,i4,i5> ca. <00010> (group g00010) and <11111> (g11111), respectively. Most points collapse, especially in groups 5–8. Periods p00, p10, and p11 represent rows 1–3 in Table 5.6. The function P(i1,i2,i3,i4,i5,i6,i7) denotes series of periodic waves (PWs) limited by minima and maxima, which suggest a PW behavior that recalls the form of a trigonometric function. For <i1,i2,i3,i4,i5,i6,i7>, a minimum is shown. The distance in <i1,i2,i3,i4,i5,i6,i7> units between each pair of consecutive minima is eight, which coincides with NCA sets in successive PWs. The minima occupy analogous positions in the curve and are in phase. The representative points in phase should correspond to elements in the same group in PT. For <i1,i2,i3,i4,i5,i6,i7> minima, coherence exists between the two representations; however, the consistency is not general. Wave comparison shows two differences: (i) all PWs are incomplete, and (ii) PWs p00 and p10 are staircase-like. Most characteristic points of the plot are minima that lie about group g00010. Values of <i1,i2,i3,i4,i5,i6,i7> are repeated, as the periodic law (PL) states.

Figure 5.15 Variation of property P(p) of N-aryloxazolidinone-5-carboxamides versus group number.

An empirical function P(p) reproduces different <i1,i2,i3,i4,i5,i6,i7> values. A minimum of P(p) has meaning only if it is compared with the former P(p–1) and later P(p + 1) points, needing to fulfill:

$$P_{\min}(p) < P(p-1), \qquad P_{\min}(p) < P(p+1). \tag{5.8}$$

Order relations [Eq. (5.8)] should repeat at determined intervals equal to PW size and are equivalent to

$$P_{\min}(p) - P(p-1) < 0, \qquad P(p+1) - P_{\min}(p) > 0. \tag{5.9}$$
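The minimum conditions, and the spacing between consecutive minima, can be checked numerically. The sketch below is illustrative only: the series is a hypothetical sawtooth with waves of eight points, not the NCA data, and the helper names are assumptions.

```python
def local_minima(values):
    """Indices p with values[p] < values[p-1] and values[p] < values[p+1]."""
    return [p for p in range(1, len(values) - 1)
            if values[p] < values[p - 1] and values[p] < values[p + 1]]


def spacings(indices):
    """Distances between consecutive indices."""
    return [b - a for a, b in zip(indices, indices[1:])]


# Hypothetical series: a leading point, two rising waves of eight points,
# and the start of a third wave.
wave = list(range(1, 9))
series = [3] + wave * 2 + [1, 2]

minima = local_minima(series)
print(minima)            # [1, 9, 17]
print(spacings(minima))  # [8, 8]
```

A constant spacing equal to the wave size is what "minima in phase" means in this representation.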

As Eq. (5.9) is valid only for minima, other more general expressions are desired for all values of p. The differences D(p) = P(p + 1)–P(p) are calculated by assigning every value to NCA p:

$$D(p) = P(p+1) - P(p). \tag{5.10}$$

Alternatively, instead of D(p), the ratios R(p) = P(p + 1)/P(p) are taken, each assigned to NCA p. If PL were general, elements in the same group in analogous positions in different PWs would satisfy

$$\text{either } D(p) > 0 \text{ or } D(p) < 0 \tag{5.11}$$

$$\text{either } R(p) > 1 \text{ or } R(p) < 1 \tag{5.12}$$
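A minimal sketch of how D(p), R(p), and the sign criterion could be evaluated is given here. The values of P are hypothetical (two waves of four points), and the helper names are assumptions rather than part of the chapter's code. Note that the last D(p) of a wave uses the first point of the following wave.

```python
def differences(P):
    """D(p) = P(p+1) - P(p), assigned to position p."""
    return [b - a for a, b in zip(P, P[1:])]


def ratios(P):
    """R(p) = P(p+1) / P(p), assigned to position p (P(p) must be nonzero)."""
    return [b / a for a, b in zip(P, P[1:])]


def coherent_positions(D, wave_size):
    """For each position in a wave, True if D(p) keeps one sign in all waves."""
    result = []
    for pos in range(wave_size):
        signs = {d > 0 for d in D[pos::wave_size] if d != 0}
        result.append(len(signs) <= 1)
    return result


# Hypothetical property values for two waves of four points each.
P = [1, 2, 4, 3, 1, 3, 2, 5]
D = differences(P)
R = ratios(P)

print(D)                         # [1, 2, -1, -2, 2, -1, 3]
print(coherent_positions(D, 4))  # [True, False, False, True]
```

If the periodic law were general, every entry of the last list would be `True`; a `False` marks a position where successive waves are out of phase.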

However, the results show that this is not the case, so PL is not general and some anomalies exist. The change of D(p) versus group number (cf. Figure 5.16) presents a lack of coherence between the <i1,i2,i3,i4,i5,i6,i7> Cartesian and PT representations. Most results collapse, mainly in groups 5–7. The datum for group 8, PW p00, should be taken with care because it was calculated with the first point in the following PW. If consistency were rigorous, all points in every PW would have the same sign. A trend exists for the points to give D(p) > 0 for lower groups, but not for greater ones. However, irregularities exist in which NCAs for successive PWs are not always in phase.

Figure 5.16 Variation of property D(p) = P(p + 1) – P(p) versus group number. P is the vector property.

The change of R(p) versus group number (cf. Figure 5.17) confirms the lack of constancy between Cartesian and PT charts. Most data collapse, especially in groups 5–7. If steadiness were exact, all points in every PW would show R(p) as either lesser or greater than 1. A trend in the points exists to give R(p) > 1 for lower groups, but not for greater ones. Notwithstanding, confirmed incongruities exist in which NCAs for the successive PWs are not always in phase.

Figure 5.17 Variation of property R(p) = P(p + 1)/P(p) versus group number. P is the vector property.

The IE approach identifies cliffs. The method was applied to 74 flavonoids and is being used for 177 phenolic compounds. The technique is not sensitive to the number of compounds; otherwise, it could result in one-case classes that could be outliers. The impact of noisy experimental data on the method performance is expected to be small. Matched molecular pair analysis (MMPA) focuses on the effects of specific structural changes on properties of interest; its assumption is that differences in a property are predicted more accurately than the property itself; it is applied to structurally diverse data sets, which raises confidence that the observed effects of structural changes are globally relevant; it is related to bioisosterism in its focus on specific substructural transformations (Birch et al., 2009; Griffen et al., 2011; Kenny and Sadowski, 2005; Leach et al., 2006; Papadatos et al., 2010; Schultes et al., 2012; Warner et al., 2012). However, consider the following: (i) it goes further in providing quantitative estimates of the changes that result from the application of particular transformations and provides an inverse QSAR; (ii) it models not only bioactivity, but also any chemical, physicochemical, or pharmacokinetic property.

It is an inverse QSAR, which is gaining popularity in the retrospective analysis of large experimental data sets. While much focus was on the differences in properties between structurally related groups of existing compounds, attempts to extend it to the de novo design of structures were limited. Using IE, one looks for trends and central tendencies for novel drugs, mixtures, or properties; however, when one uses MMPA to look at activities that involve some kind of interaction with a binding site, one is looking for exceptions.

8 Conclusions

From these results and discussion, the following conclusions can be drawn:

1. Several criteria, selected to reduce the analysis to a manageable quantity of N-aryloxazolidinone-5-carboxamides, refer to substitutions at positions R1/2, R3–6 on different phenyls and R7. Many classification algorithms are based on IE. For sets of moderate size, an excessive number of results appear compatible with the data and suffer a combinatorial explosion. However, after the EC, the best configuration is that in which the entropy production is most uniformly distributed. Molecular structural elements are ranked according to inhibitory activity: R3 > R6 > R7 > R4 > R5 > R2 > R1. In compound 5, R3 = R6 = R4 = R2 = H, R7 = iPr, R5 = OCH3, and R1 = Ac <1111111>, which was selected as reference. Substances are grouped into five classes. This method avoids the problem that other methods have with continuous variables because, for <1111111>, a null standard deviation causes a Pearson correlation coefficient of 1. The results agree with the principal component analyses.

2. The periodic law does not satisfy the laws of physics: (i) the inhibitory activity of N-aryloxazolidinone-5-carboxamides is not repeated, as the periodic law states, which is perhaps due to their chemical character; (ii) the order relationships are repeated with exceptions. The analysis forces the statement: The relationships that any compound p has with its neighbor p + 1 are approximately repeated for every period. The periodicity is not general; however, if a natural order of substances is accepted, the periodic law must be phenomenological. The inhibitory activity was not used in the generation of the PT and serves to validate it. Work to be done is the periodic analysis of other molecular properties (e.g., cytotoxicity) that are not used in the construction of the PT. It would give insight into the possible generality of the PL. The authors are happy with the PT, although it should be tested for all properties: adverse effects, etc. The method is intended for medicinal, nutritional, and phylogenetic chemists. However, the obtained PT deserves a broader spectrum (e.g., for doctors to mix carboxamides of different classes with dissimilar types of properties: high antiviral activity, low cytotoxicity, etc.).

3. Code MolClas is a simple, reliable, efficient and fast procedure for molecular classification, based on EC of entropy production. It was written not only to analyze EC, but also to explore the world of molecular classification.

References

Alberts B, Johnson A, Lewis J, Raff M, Roberts K, Walter P. Molecular Biology of the Cell. New York, NY: Garland; 2002.

Ali A, Reddy G.S.K.K., Cao H, Anjum SG, Nalam MNL, Schiffer CA, Rana TM. Discovery of HIV-1 protease inhibitors with picomolar affinities incorporating N–aryl–oxazolidinone–5–carboxamides as novel P2 ligands. J. Med. Chem. 2006;49:7342–7356.

Baez J, Dolan J. Higher dimensional algebra and topological quantum field theory. J. Math. Phys. 1995;36:6073–6105.

Benzecri JP. L’analyse des données. vol. 1. Paris, France: Dunod; 1984.

Birch AM, Kenny PW, Simpson I, Whittamore PRO. Matched molecular pair analysis of activity and properties of glycogen phosphorylase inhibitors. Bioorg. Med. Chem. Lett. 2009;19:850–853.

Castellano G, Torrens F. AIDS destroys immune defences: hypothesis. New Front. Chem. 2014;23:11–20 Classification by information entropy of triterpenoids and steroids of Ganoderma. Phytochemistry.

Castellano G, Tena J, Torrens F. Classification of polyphenolic compounds by chemical structural indicators and its relation to antioxidant properties of Posidonia oceanica (L.) Delile. MATCH Commun. Math. Comput. Chem. 2012;67:231–250.

Castellano G, González-Santander JL, Lara A, Torrens F. Classification of flavonoid compounds by using entropy of information theory. Phytochemistry. 2013;93:182–191.

Castellano G, Lara A, Torrens F. Classification of stilbenoid compounds by entropy of artificial intelligence. Phytochemistry. 2014;97:62–69.

Castellano-Estornell G, Torrens-Zaragozá F. Local anaesthetics classified using chemical structural indicators. Nereis. 2009;2:7–17.

Chakravarty AK. Immunology and immunotechnology. New Delhi, India: Oxford University; 2006.

Cox E. The fuzzy systems handbook. New York, NY: Academic; 1994.

Crans S. On braidings, syllepses and symmetries. Cahiers Topologie Géom. Différentielle Catég. 2000;41(1):2–74.

Griffen E, Leach AG, Robb GR, Warner DJ. Matched molecular pairs as a medicinal chemistry tool. J. Med. Chem. 2011;54:7739–7750.

Halder AK, Jha T. Validated predictive QSAR modeling of N-aryl-oxazolidinone-5-carboxamides for anti-HIV protease activity. Bioorg. Med. Chem. Lett. 2010;20:6082–6087.

Hotelling H. Analysis of a complex of statistical variables into principal components. J. Educ. Psychol. 1933;24:417–441.

Huson DH. SplitsTree: analyzing and visualizing evolutionary data. Bioinformatics. 1998;14:68–73.

IMSL. Integrated Mathematical Statistical Library (IMSL). Houston, TX: IMSL; 1989.

Iordache O. Modeling Multi-Level Systems. Berlin, Ger.: Springer; 2011.

Iordache O. Self-Evolvable Systems: Machine Learning in Social Media. Berlin, Ger: Springer; 2012.

Iordache O, Corriou JP, Garrido-Sánchez L, Fonteix C, Tondeur D. Neural network frames. Application to biochemical kinetic diagnosis. Comput. Chem. Eng. 1993;17:1101–1113.

Jarvis RA, Patrick EA. Clustering using a similarity measure based on shared nearest neighbors. IEEE Trans. Comput. 1973;C22:1025–1034.

Jolliffe IT. Principal Component Analysis. New York, NY: Springer; 2002.

Katzung BG, ed. Basic and Clinical Pharmacology. New Delhi, India: McGraw-Hill; 2004.

Kaufmann A. Introduction à la théorie des sous-ensembles flous. vol. 3. Paris, France: Masson; 1975.

Kenny PW, Sadowski J. Structure modification in chemical databases. In: Oprea TI, ed. Chemoinformatics in Drug Discovery. Weinheim, Ger: Wiley-VCH; 2005:271–285.

Kramer R. Chemometric Techniques for Quantitative Analysis. New York, NY: Marcel Dekker; 1998.

Kullback S. Information Theory and Statistics. New York, NY: Wiley; 1959.

Kundu S. The min–max composition rule and its superiority over the usual max–min composition rule. Fuzzy Set. Syst. 1998;93:319–329.

Lambert-Torres G, Pereira Pinto JO, Borges da Silva LE. Minmax techniques. In: Wiley Encyclopedia of Electrical and Electronics Engineering. New York, NY: Wiley; 1999.

Leach AG, Jones HD, Cosgrove DA, Kenny PW, Ruston L, MacFaul P, Wood JM, Colclough N, Law B. Matched molecular pairs as a guide in the optimization of pharmaceutical properties; a study of aqueous solubility, plasma protein binding and oral exposure. J. Med. Chem. 2006;49:6672–6682.

Leinster T. Higher Operads, Higher Categories. Cambridge, UK: Cambridge University Press; 2004.

Liu X, Gan M, Dong B, Zhang T, Li Y, Zhang Y, Fan X, Wu Y, Bai S, Chen M, Yu L, Tao P, Jiang W, Si S. 4862 F, a new inhibitor of HIV-1 protease, from the culture of Streptomyces I03A-04862. Molecules. 2013;18:236–243.

Page RDM. Program TreeView. Glasgow, UK: University of Glasgow; 2000.

Papadatos G, Alkarouri M, Gillet VJ, Willett P, Kadirkamanathan V, Luscombe CN, Bravi G, Richmond NJ, Pickett SD, Hussain J, Pritchard JM, Cooper AWJ, Macdonald SJF. Lead optimization using matched molecular pairs: inclusion of contextual information for enhanced prediction of hERG inhibition, solubility, and lipophilicity. J. Chem. Inf. Model. 2010;50:1872–1886.

Patra SK, Mandal AK, Pal MK. State of aggregation of bilirubin in aqueous solution: principal component analysis approach. J. Photochem. Photobiol. A. 1999;122:23–31.

Schultes S, de Graaf C, Berger H, Mayer M, Steffen A, Haaksma EEJ, de Esch IJP, Leurs R, Krämer O. A medicinal chemistry perspective on melting point: matched molecular pair analysis of the effects of simple descriptors on the melting point of drug-like compounds. Med. Chem. Comm. 2012;3:584–591.

Shannon CE. A mathematical theory of communication: part I, discrete noiseless systems. Bell Syst. Tech. J. 1948a;27:379–423.

Shannon CE. A mathematical theory of communication: part II, the discrete channel with noise. Bell Syst. Tech. J. 1948b;27:623–656.

Shaw PJA. Multivariate Statistics for the Environmental Sciences. New York, NY: Hodder-Arnold; 2003.

Tondeur D, Kvaalen E. Equipartition of entropy production. An optimality criterion for transfer and separation processes. Ind. Eng. Chem. Fundam. 1987;26:50–56.

Torrens F, Castellano G. Periodic classification of local anaesthetics (procaine analogues). Int. J. Mol. Sci. 2006;7:12–34.

Torrens F, Castellano G. Classification of complex molecules. In: Hassanien A.-E., Abraham A, eds. Foundations of Computational Intelligence. vol. 5. Berlin, Ger: Springer; 2009a:243–315.

Torrens F, Castellano G. Modelling of complex multicellular systems: tumour–immune cells competition. Chem. Cent. J. 2009b;3(Suppl. I) 75–1-1.

Torrens F, Castellano G. Table of periodic properties of human immunodeficiency virus inhibitors. Int. J. Comput. Intell. Bioinformatics Syst. Biol. 2010;1:246–273.

Torrens F, Castellano G. Information entropy and the table of periodic properties of local anaesthetics. Int. J. Chemoinform. Chem. Eng. 2011a;1(2):15–35.

Torrens F, Castellano G. Molecular classification of thiocarbamates with cytoprotection activity against human immunodeficiency virus. Int. J. Chem. Model. 2011b;3:269–296.

Torrens F, Castellano G. Structural classification of complex molecules by artificial intelligence techniques. In: Castro ED, Haghi AK, eds. Advanced Methods and Applications in Chemoinformatics: Research Progress and New Applications. Hershey, PA: IGI Global; 2012a:25–91.

Torrens F, Castellano G. Complexity, emergence and molecular diversity via information theory. In: Orsucci F, Sala N, eds. Complexity Science, Living Systems, and Reflexing Interfaces: New Models and Perspectives. Hershey, PA: IGI Global; 2012b:196–208.

Torrens F, Castellano G. Molecular diversity classification via information theory: a review. ICST Tran. Compl. Syst. 2012c;12(10-12):e41–e48.

Torrens F, Castellano G. Structural classification of complex molecules by information entropy and equipartition conjecture. In: Putz MV, ed. Chemical Information and Computational Challenges in 21st Century. New York, NY: Nova; 2012d:101–139.

Torrens F, Castellano G. Molecular classification by information theoretic entropy: oxadiazolamines as potential therapeutic agents. Curr. Comput. Aided Drug Des. 2013a;9:241–253.

Torrens F, Castellano G. Molecular classification of 5-amino-2-aroylquinolines and 4-aroyl-6,7,8-trimethoxyquinolines as highly potent tubulin polymerization inhibitors. Int. J. Chemoinform. Chem. Eng. 2013b;3(2):1–26.

Torrens F, Castellano G. Molecular classification, diversity and complexity via information entropy. In: Stavrinides SG, Banerjee S, Caglar H, Ozer M, eds. Chaos and Complex Systems. vol. 4. Berlin, Ger: Springer; in press-a.

Torrens F, Castellano G. AIDS destroys immune defences: hypothesis. In: New Frontiers in Chemistry; in press-b.

Tripathi KD. Essentials of Medical Pharmacology. New Delhi, India: Jaypee Brothers; 2003.

Tryon RC. A multivariate analysis of the risk of coronary heart disease in Framingham. J. Chronic Dis. 1939;20:511–524.

Varmuza K. Pattern Recognition in Chemistry. New York, NY: Springer; 1980.

Volberding PA, Deeks SG. Antiretroviral therapy and management of HIV infection. Lancet. 2010;376:49–62.

Wang RR, Gao YD, Ma CH, Zhang XJ, Huang CG, Huang JF, Zheng YT. Mangiferin, an anti-HIV-1 agent targeting protease and effective against resistant strains. Molecules. 2011;16:4264–4277.

Warner DJ, Bridgland-Taylor MH, Sefton CE, Wood DJ. Prospective prediction of antitarget activity by matched molecular pairs analysis. Mol. Inform. 2012;31:365–368.

White H. Neural network learning and statistics. AI expert. 1989;4(12):48–52.

Xu J, Hagler A. Chemoinformatics and drug discovery. Molecules. 2002;7:566–600.

Zhang J, Hou T, Liu Y, Chen G, Yang X, Liu JS, Wang W. Systematic investigation on interactions for HIV drug resistance and cross-resistance among protease inhibitors. J. Proteome Sci. Comput. Biol. 2012;1(2):1–1-8.
