Introduction
Taxoids are a group of diterpenoid cyclodecanes isolated from Taxus baccata [1], Taxus brevifolia [2], Taxus canadensis [3], Taxus chinensis [4], Taxus cuspidata [5, 6], Taxus floridana [7], Taxus sumatrana [8] or Taxus wallichiana [9]. Some taxoids decrease the critical concentration of tubulin required for assembly [10, 11] by inhibiting microtubule disassembly [12, 13]. Other taxoids reduce the CaCl2-induced depolymerization of microtubules [14] or increase the cellular accumulation of vincristine in multi-drug resistant tumour cells [14, 15]. Since their discovery, taxoids have been used in the treatment of polycystic kidney diseases [16] and neoplasms (ovarian cancer [17], breast cancer [18], non-small cell lung cancer [19], prostate cancer [20], and head and neck cancer [21]). A major disadvantage of these compounds is their poor solubility [22, 23]. Many studies use structure-activity relationships and/or synthetic modification in order to increase the activity and solubility of new taxoid analogues [24, 25]. Thirty-five cytotoxic taxoids, isolated through chromatographic purification of the taxoid fraction from the stem of Taxus cuspidata Sieb. et Zucc. var. nana Rehder [25, 26], were studied using comparative molecular field analysis (CoMFA) [6]. The statistical characteristics of the model reported by Morita et al. [6] are: r2 = 0.979, r2cv-loo = 0.818, s = 0.196, F = 267.621, n = 35, v = 5 (Eq (1)), where r2 is the squared correlation coefficient, r2cv-loo the squared cross-validation leave-one-out coefficient, s the standard error of the estimate, F the Fisher parameter, n the sample size, and v the number of variables. Building on the successful results obtained with the original molecular descriptors family on structure-activity relationships (MDF-SAR) methodology [27-30], the aim of this research was to investigate and assess the estimation and prediction abilities of the MDF-SAR approach on a sample of taxoids.
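As an illustration of how the statistical characteristics listed for Eq (1) relate to each other, the following minimal sketch computes r2, s and F from observed and estimated activities. It assumes an ordinary least-squares model with an intercept, so that the residual degrees of freedom are n - v - 1; the function name is hypothetical and the code is not taken from the cited study.

```python
import math

def regression_stats(y_obs, y_est, v):
    """Return (r2, s, F) for a linear model with v descriptors plus an intercept.

    Assumption: y_est comes from a least-squares fit, so 1 - SSE/SST
    can be read as the squared correlation coefficient.
    """
    n = len(y_obs)
    mean = sum(y_obs) / n
    sse = sum((o - e) ** 2 for o, e in zip(y_obs, y_est))  # residual sum of squares
    sst = sum((o - mean) ** 2 for o in y_obs)              # total sum of squares
    df = n - v - 1                                         # residual degrees of freedom
    r2 = 1.0 - sse / sst
    s = math.sqrt(sse / df)                                # standard error of the estimate
    f = (r2 / v) / ((1.0 - r2) / df)                       # Fisher parameter
    return r2, s, f
```

Because F depends only on r2, n and v under these assumptions, small rounding differences in a reported r2 can shift the recomputed F noticeably when r2 is close to 1.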
Material and methods
Data set: taxoids
A sample of thirty-four taxoids was investigated. The compounds analyzed were as follows: taxol (t01); 10-deacetyl-taxol (t02); taxol B (t03); 10-deacetyl-taxol B (t04); taxol C (t05); 10-deacetyl-taxol C (t06); taxuspinanane A (t07); taxol D (t08); baccatin III (t09); 9-dihydro-14-acetyl baccatin III (t10); taxuspinanane C (t11); 7,9,10-deacetyl baccatin VI (t12); taxuspinanane D (t13); brevifoliol (t14); taxusin (t15); 2a-deacetoxy taxinine J (t16); taxinin (t17); taxa-4(20), 11-diene-2a, 5a, 9a, 13a-pentaol pentaacetate (t18); taxa-4(20), 11-diene-5a, 7b, 9a, 10b, 13a-pentaol pentaacetate (t19); taxa-4(20), 11-diene-5a, 7b, 9a, 10b, 13a-pentaol 7b, 9a, 10b-triacetate (t20); 2a-a-methyl butyryloxy-5a-7b, 10b-triacetyl-(4), 20, 11-taxadine (t21); taxa-4(20), 11-diene-5a, 7b, 10b, 13a-pentaol 7b, 9a, 10b, 13a tetra-acetate (t22); taxinin B (t23); decinnamoyl taxinine J (t24); taxuspinanane K (t25); taxuspine F (t26); taxuspinanane G (t27); taxuspine L (t28); taxchin A (t29); taxinine M (t30); taxgifine (t31); taxa-4(20), 11-taxadiene-2a, 5a, 10b, 14b-(s)2’-methyl butyrate (t32); 1b-hydroxy-baccatin I (t33); and taxuspinanane H (t34). The growth inhibition activity, expressed as log 1/IC50 (where IC50 is the concentration of a taxoid required for 50% growth inhibition in vitro), was taken from a previously reported study [6]. The generic structures of the taxoids are presented in Figure 1. The abbreviation (Abb.), the substituents (Si, where i=1, …, 6) and the experimental growth inhibition activity (Yobs) are presented in Table I.
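The activity scale can be illustrated with a short sketch that converts an IC50 value into log 1/IC50. It assumes a base-10 logarithm and makes no assumption about the concentration unit, neither of which is stated here; the function name is hypothetical.

```python
import math

def log_inverse_ic50(ic50):
    """Return log10(1/IC50) for a positive IC50 concentration.

    Assumption: decimal logarithm; the unit of IC50 is whatever
    the source data use and is not converted here.
    """
    if ic50 <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50)
```

On this scale a more potent compound (lower IC50) receives a higher value, which is why the models are fitted to log 1/IC50 rather than to IC50 itself.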
Computational methodology
The growth inhibition activity of taxoids was modelled using the MDF-SAR approach [31]. The structure of each taxoid was drawn using HyperChem software [32]. The observed inhibition activity was stored in the taxoids.txt file. The molecular descriptors family was then generated, with the descriptors calculated strictly from the information contained in the compounds' 2D and 3D structures. Each descriptor had an individual seven-letter name encoding how it was constructed [31]. Starting from the generated molecular descriptors, an algorithm was applied to identify the best MDF-SAR models with one and with more than one variable. To identify the regression models with the best goodness-of-fit, a step-wise forward selection approach was used: it started with one variable in the model, tried the remaining variables one by one, and included a variable if the resulting model was statistically significant. The following internal validation approaches were applied to assess the models: statistical characteristics of the regression model, leave-one-out analysis [33], and correlated correlation analysis (Steiger’s and Fisher Z tests) [34]. Pearson, Spearman and semi-quantitative correlation coefficients [35] were calculated to compare the MDF-SAR model with the highest squared correlation coefficient against the previously reported model [36].
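The forward selection step described above can be sketched generically as follows. This is not the MDF-SAR implementation: it is a minimal, standard-library-only illustration in which the function names, the descriptor dictionary layout and the fixed critical value f_crit (standing in for a proper significance test) are all assumptions.

```python
def fit_ols(X, y):
    """Solve the normal equations; returns [intercept, b1, ..., bk]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(k):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def r_squared(X, y):
    """Squared correlation coefficient of the least-squares fit."""
    b = fit_ols(X, y)
    mean = sum(y) / len(y)
    sse = sum((t - (b[0] + sum(bj * xj for bj, xj in zip(b[1:], x)))) ** 2
              for x, t in zip(X, y))
    sst = sum((t - mean) ** 2 for t in y)
    return 1.0 - sse / sst

def forward_select(descriptors, y, max_vars, f_crit=4.0):
    """Greedy forward selection over a dict {name: list of values}.

    Assumption: a variable is accepted when its partial F statistic
    reaches f_crit, a hypothetical fixed threshold.
    """
    n, selected, best_r2 = len(y), [], 0.0
    while len(selected) < max_vars:
        trials = []
        for name in descriptors:
            if name not in selected:
                cols = selected + [name]
                X = [[descriptors[c][i] for c in cols] for i in range(n)]
                trials.append((r_squared(X, y), name))
        if not trials:
            break
        r2, name = max(trials)          # candidate giving the best fit
        df = n - len(selected) - 2      # residual df once the candidate is added
        f_partial = (r2 - best_r2) * df / max(1.0 - r2, 1e-12)
        if f_partial < f_crit:
            break
        selected.append(name)
        best_r2 = r2
    return selected, best_r2
```

A greedy search of this kind evaluates far fewer models than an exhaustive search, at the cost of possibly missing descriptor combinations that are only informative together.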
Results
One univariate MDF-SAR model and two multivariate models, one with three and the other with five descriptors, proved to have good estimation and prediction abilities. A summary description of the models is presented in Table II, and the statistical analysis of the models is presented in Table III. The estimation abilities of the models from Eq(2)-Eq(4) are presented in Table IV in terms of the activity estimated by each model (ŶEq(2), ŶEq(3), and ŶEq(4)) and the residuals (REq(2), REq(3), REq(4)). In order to test the statistical hypothesis that the correlation coefficients obtained by the MDF-SAR models from Eq(2)-Eq(4) are not statistically different, Steiger’s Z test was applied at a significance level of 5%. The following results were obtained:
• rEq(2) vs. rEq(3): Z=3.4891 (p<0.0001),
• rEq(2) vs. rEq(4): Z=5.5845 (p<0.0001),
• rEq(3) vs. rEq(4): Z=3.0192 (p<0.003).
The prediction ability of the MDF-SAR model from Eq(4) was assessed by randomly splitting the sample into training and test sets (23 compounds in training and 11 compounds in test). The equation and statistical characteristics are:
Ŷ = –7.41 – 0.30 × lmPrVQt – 0.03 × iNMMkQg – 1.10 × lmPrsCg + 216.40 × IIMdPQg + 0.75 × IHDrFHt, Eq (5)
r2training = 0.9728; Ftraining = 122 (p<0.0001); r2test = 0.9752; Ftest = 35 (p<0.001).
The graphical representation of the models in the training and test sets, when the training set contained 2/3 of the sample, is presented in Figure 2. The comparison between the MDF-SAR model with five descriptors and the previously reported CoMFA model [6] was done by applying a correlated correlation analysis using the Pearson, semi-quantitative and Spearman methods [35]. The results, expressed as correlation coefficients, are presented in Table V. The graphical representation of the growth inhibition activity measured experimentally and estimated by the CoMFA model [6] and by the MDF-SAR model with five descriptors is presented in Figure 3.
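For readers unfamiliar with the comparison of correlated correlation coefficients, the sketch below implements one common form of Steiger's Z test [34] for two dependent correlations that share a variable (here, the observed activity correlated with the estimates of two different models). It is an assumption that this pooled-correlation variant matches the one used to produce the Z values above; the function name and argument order are hypothetical.

```python
import math

def steiger_z(r12, r13, r23, n):
    """Steiger's Z for two dependent correlations sharing variable 1.

    r12, r13: correlations of the shared variable (e.g. observed activity)
              with variables 2 and 3 (e.g. estimates from two models)
    r23:      correlation between variables 2 and 3
    n:        sample size
    Returns (Z, two-sided p-value from the standard normal distribution).
    """
    z12, z13 = math.atanh(r12), math.atanh(r13)   # Fisher z-transforms
    rbar2 = ((r12 + r13) / 2.0) ** 2              # squared pooled correlation
    # Covariance term of the two z-transformed correlations
    psi = r23 * (1 - 2 * rbar2) - 0.5 * rbar2 * (1 - 2 * rbar2 - r23 ** 2)
    s = psi / (1 - rbar2) ** 2
    z = (z12 - z13) * math.sqrt((n - 3) / (2 - 2 * s))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p
```

The test needs the correlation between the two sets of estimates (r23) in addition to the two correlations being compared, because both are computed on the same compounds and are therefore not independent.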
Discussion
The aim of the research was reached: the molecular descriptors family on structure-activity relationships proved to be a valid approach for characterizing the growth inhibition activity of taxoids based on information obtained strictly from 2D and 3D structures. Because they allow in silico experiments on new taxoids, structure-activity methods are used to obtain compounds with increased activity and solubility and decreased toxicity [37-39]. These methods differ in the type, construction and calculation of their descriptors. The MDF-SAR method is unique in that it generates and calculates descriptors based on the topological and geometrical model of the compounds. Analyzing the molecular descriptors used by the MDF-SAR models (Table II), it can be observed that one descriptor (IHDrFHt) appears in all models, showing that the activity of taxoids is related to the compounds' topology and depends on the number of directly bonded hydrogens. As expected, the estimation abilities increase with the number of descriptors. The model with three descriptors revealed that the inhibition activity of taxoids is related to the compounds’ geometry and topology, and to three atomic properties: cardinality, charge and number of directly bonded hydrogens. The MDF-SAR model with five variables showed that the growth inhibition activity of the studied taxoids depends on the compounds’ geometry (iNMMkQg, lmPrsCg, IIMdPQg) as well as on their topology (lmPrVQt, IHDrFHt). This model also revealed that the partial charges (lmPrVQt, iNMMkQg, IIMdPQg), the number of directly bonded hydrogens (IHDrFHt) and the cardinality (lmPrsCg) are the atomic properties that influence the growth inhibition activity. All MDF-SAR models were statistically significant (Table III).
The estimation abilities of the models are supported by the values of the correlation coefficient and the associated adjusted squared correlation coefficient, which with one exception (the model with one descriptor) were greater than 0.90. Furthermore, the sums of residuals of the MDF-SAR models were very low (0.0000 for Eq(2), 0.0057 for Eq(3), and 0.0045 for Eq(4), respectively). In statistical terms, it can be concluded that there is a very good level of association between growth inhibition activity and the three descriptors used by the model from Eq(3) and the five descriptors used by the model from Eq(4), respectively. For Eq(3), 94% of the variation in growth inhibition activity of the studied taxoids can be explained by its linear relationship with the molecular descriptors used as predictors. A better result is obtained by Eq(4), where 98% of the variation in growth inhibition activity of the studied taxoids can be explained by its linear relationship with the molecular descriptors of the model. The leave-one-out analysis allows the internal predictive power of the MDF-SAR models to be quantified. Good predictive power was obtained by Eq(3) and Eq(4), the values of r2loo being greater than or equal to 0.93 (Table III). The small difference of 0.01 between the squared correlation coefficient and the squared cross-validation leave-one-out coefficient (obtained for Eq(3) and Eq(4)) supports the stability of the multivariate MDF-SAR models. The correlated correlation analysis revealed that the models with three and five descriptors obtained statistically significantly higher correlation coefficients than the model with one descriptor (p<0.0001). The model with five descriptors obtained a correlation coefficient statistically significantly higher than the model with three descriptors (p<0.003).
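The leave-one-out coefficient discussed above can be illustrated with a minimal sketch: each compound is removed in turn, the model is refitted on the remaining compounds, and the removed compound is predicted. The sketch assumes the coefficient is computed as 1 - PRESS/SST; the cited leave-one-out tool [33] may instead use the squared correlation between observed and predicted values, and the function names are hypothetical.

```python
def fit_ols(X, y):
    """Solve the normal equations; returns [intercept, b1, ..., bk]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    M = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(k):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[c])]
    return [M[i][k] / M[i][i] for i in range(k)]

def loo_q2(X, y):
    """Leave-one-out cross-validated coefficient, 1 - PRESS/SST.

    X is a list of descriptor rows (lists), y the observed activities.
    """
    n = len(y)
    mean = sum(y) / n
    press = 0.0
    for i in range(n):
        # Refit the model without compound i, then predict compound i
        b = fit_ols(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        pred = b[0] + sum(bj * xj for bj, xj in zip(b[1:], X[i]))
        press += (y[i] - pred) ** 2
    sst = sum((t - mean) ** 2 for t in y)
    return 1.0 - press / sst
```

A leave-one-out coefficient close to the fitted squared correlation coefficient, as reported for Eq(3) and Eq(4), suggests that no single compound dominates the fit.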
The robustness and predictivity assessment [40, 41] of the model with five descriptors (Eq(4)) showed that the model is stable and valid: the intercept and coefficients of the model obtained in the training and test set analysis (Eq(5)) fell within the confidence intervals of the intercept and coefficients of the model from Eq(4) (Table III). Moreover, the correlation coefficients in the training and test sets are within the 95% confidence interval of the model from Eq(4) (Figure 3, Table III). The MDF-SAR model with five descriptors is thus satisfactory and stable in the training versus test analysis, which supports the model’s robustness. Taking into consideration the statistical performance of the MDF-SAR models, it can be concluded that the model with five descriptors is better than the models with three descriptors and with one descriptor. The comparison of the model with five descriptors with the previously reported model [6] revealed that their abilities are similar (Table V). Analyzing the semi-quantitative and Spearman correlation coefficients, it can be observed that the model from Eq(4) obtained slightly better results in terms of squared correlation coefficient. The absence of statistically significant differences between the CoMFA model [6] and the MDF-SAR model with five descriptors can also be seen in the graphical representation in Figure 3. The two models differ in how the descriptors are generated and calculated and in the approach used. The MDF-SAR model with five descriptors has been included in the MDF-SAR library and could be used to predict the growth inhibition activity of other taxoids [42]. The activity of new taxoids can thus be estimated in a virtual environment free of experimental accidents and measurement errors, opening a new pathway in the activity characterization of compounds.
This environment has real potential for clinical applications in the first step of knowledge translation (generation of evidence from research [43]), in the design of new drugs with higher curative effects and fewer adverse effects. Any researcher can freely use the prediction environment by supplying the compound drawn as a *.hin file. Further research will focus on external validation of the model with five descriptors, through the assessment of taxoids not included in the process of model development. In conclusion, three MDF-SAR models with good statistical characteristics were obtained: one with one descriptor, one with three descriptors and one with five descriptors. The MDF-SAR model with five descriptors obtained a correlation coefficient significantly greater than the other MDF-SAR models. According to the MDF-SAR model with five descriptors, the growth inhibition activity of the studied taxoids is of geometric and topological nature, being related to the partial charges of the compounds, the number of directly bonded hydrogens and the cardinality. Even though the correlation coefficient obtained by the MDF-SAR model with five descriptors is similar to that obtained by the previously reported model, the applied validation methods demonstrate its stability and reliability.
Acknowledgments
This research was supported by UEFISCSU Romania through grants (ID_458 and ID_1051).
References
1. Khosroushahi AY, Valizadeh M, Ghasempour A, et al. Improved Taxol production by combination of inducing factors in suspension cell culture of Taxus baccata. Cell Biol Int 2006; 30: 262-9.
2. Wani MC, Taylor HL, Wall ME, Coggon P, McPhail AT. Plant antitumor agents. VI. Isolation and structure of taxol, a novel antileukemic and antitumor agent from Taxus brevifolia. J Am Chem Soc 1971; 93: 2325-7.
3. Zamir LO, Zhang J, Wu J, Sauriol F, Mamer O. Five novel taxanes from Taxus canadensis. J Nat Prod 1999; 62: 1268-73.
4. Yuangang Z, Yujie F, Shuangming L, Rui S, Qingyong L, Gunter S. Rapid separation of four main taxoids in Taxus species by a combined LLP-SPE-HPLC (PAD) procedure. J Sep Sci 2006; 29: 1237-44.
5. Naill MC, Roberts SC. Culture of isolated single cells from Taxus suspensions for the propagation of superior cell populations. Biotechnol Lett 2005; 27: 1725-30.
6. Morita H, Gonda A, Wei L, Takeya K, Itokawa H. 3D QSAR analysis of taxoids from Taxus cuspidata var. nana by comparative molecular field approach. Bioorg Med Chem Lett 1997; 7: 2387-92.
7. Rao KV, Bhakuni RS, Juchum J, Davies RM. A large scale process for paclitaxel and other taxanes from the needles of Taxus x media hicksii and Taxus floridana using reverse phase column chromatography. J Liq Chromatogr Relat Technol 1996; 19: 427-47.
8. Shen YC, Lin YS, Hsu SM, et al. Tasumatrols P-T, five new taxoids from Taxus sumatrana. Helv Chim Acta 2007; 90: 1319-29.
9. Joshi BS, Roy R, Chattopadhyay SK, Madhusudanan KP. An NMR and LC-MS based approach for mixture analysis involving taxoid molecules from Taxus wallichiana. J Mol Struct 2003; 645: 235-48.
10. Schiff PB, Fant J, Horwitz SB. Promotion of microtubule assembly in vitro by taxol. Nature 1979; 277: 665-7.
11. Pineda O, Farràs J, Maccari L, Manetti F, Botta M, Vilarrasa J. Computational comparison of microtubule-stabilising agents laulimalide and peloruside with taxol and colchicines. Bioorg Med Chem Lett 2004; 14: 4825-9.
12. Schiff PB, Horwitz SB. Taxol stabilizes microtubules in mouse fibroblast cells. Proc Natl Acad Sci USA 1980; 77: 1561-5.
13. Pazdur R, Kudelka AP, Kavanagh JJ, Cohen PR, Raber MN. The taxoids: paclitaxel (Taxol®) and docetaxel (Taxotere®). Cancer Treat Rev 1993; 19: 351-86.
14. Kobayashi J, Hosoyama H, Wang XX, et al. Effects of taxoids from Taxus cuspidata on microtubule depolymerization and vincristine accumulation in MDR cells. Bioorg Med Chem Lett 1997; 7: 393-8.
15. Kingston DG, Molinero AA, Rimoldi GM. The taxane diterpenoids. In: Progress in the Chemistry of Organic Natural Products. Vol. 61. Herz W, Moore RE, Kirby GW, Steglich W, Tamm C (eds). Springer, Wien 1993; 1-206.
16. Woo DD, Miao SY, Pelayo JC, Woolf AS. Taxol inhibits progression of congenital polycystic kidney disease. Nature 1994; 368: 750-3.
17. Clamp AR, Mäenpää J, Cruickshank D, et al. SCOTROC 2B: feasibility of carboplatin followed by docetaxel or docetaxel-irinotecan as first-line therapy for ovarian cancer. Br J Cancer 2006; 94: 55-61.
18. Steger GG, Galid A, Gnant M, et al.; ABCSG-14. Pathologic complete response with six compared with three cycles of neoadjuvant epirubicin plus docetaxel and granulocyte colony-stimulating factor in operable breast cancer: results of ABCSG-14. J Clin Oncol 2007; 25: 2012-8.
19. LeCaer H, Barlesi F, Robinet G, et al. An open multicenter phase II trial of weekly docetaxel for advanced-stage non-small-cell lung cancer in elderly patients with significant comorbidity and/or poor performance status: The GFPC 02-02b study. Lung Cancer 2007; 57: 72-8.
20. Beer TM, Ryan CW, Venner PM, et al.; ASCENT Investigators. Double-blinded randomized study of high-dose calcitriol plus docetaxel compared with placebo plus docetaxel in androgen-independent prostate cancer: a report from the ASCENT Investigators. J Clin Oncol 2007; 25: 669-74.
21. Samlowski WE, Moon J, Kuebler JP, et al. Evaluation of the combination of docetaxel/carboplatin in patients with metastatic or recurrent squamous cell carcinoma of the head and neck (SCCHN): A Southwest Oncology Group phase II study. Cancer Invest 2007; 25: 182-8.
22. Vaishampayan U, Parchment RE, Jasti BR, Hussain M. Taxanes: an overview of the pharmacokinetics and pharmacodynamics. Urology 1999; 54 (Suppl 6A): 22-9.
23. Hennenfent KL, Govindan R. Novel formulations of taxanes: a review. Old wine in a new bottle? Ann Oncol 2006; 17: 735-49.
24. Guéritte F. General and recent aspects of the chemistry and structure activity relationships of taxoids. Curr Pharm Des 2001; 7: 1229-49.
25. Morita H, Gonda A, Wei L, Yamamura Y, Takeya K, Itokawa H. Taxuspinananes A and B, new taxoids from Taxus cuspidata var. nana. J Nat Prod 1997; 60: 390-2.
26. Morita H, Gonda A, Wei L, et al. Four new taxoids from Taxus cuspidata var. nana. Planta Med 1998; 64: 183-6.
27. Bolboacă SD, Jäntschi L. Molecular descriptors family on structure-activity relationships: modeling herbicidal activity of substituted triazines class. Bulletin of University of Agricultural Sciences and Veterinary Medicine. Agriculture 2006; 62: 35-40.
28. Jäntschi L, Bolboacă SD. Results from the use of molecular descriptors family on structure property/activity relationships. Int J Mol Sci 2007; 8: 189-203.
29. Bolboacă SD, Jäntschi L. Data mining on structure-activity/property relationships models. WASJ 2007; 2: 323-32.
30. Bolboacă SD, Jäntschi L. Modelling the property of compounds from structure: statistical methods for models validation. Environ Chem Lett in press.
31. Jäntschi L. Molecular descriptors family on structure activity relationships 1. Review of the methodology. Leonardo El J Pract Technol 2005; 4: 76-98.
32. ***, HyperChem, Molecular Modelling System [Internet page]; ©2003, Hypercube [about three screens]; [cited 2007 November]. Available from: URL: http://www.hyper.com/.
33. ***, Leave-one-out Analysis. ©2005, Virtual Library of Free Software [cited 2007 May]. Available from: URL: http://l.academicdirect.org/Chemistry/SARs/MDF_SARs/loo/.
34. Steiger JH. Tests for comparing elements of a correlation matrix. Psychol Bull 1980; 87: 245-51.
35. Bolboacă SD, Jäntschi L. Pearson versus Spearman, Kendall’s Tau correlation analysis on structure-activity relationships of biologic active compounds. Leonardo J Sci 2006; 9: 179-200.
36. Bolboacă SD, Jäntschi L. Structure versus biological role of substituted thiadiazole- and thiadiazoline- disulfonamides. Studii si Cercetari de Biologie, Seria biologie vegetala 2007; 12: 50-6.
37. Braga SF, Galvão DS. A structure-activity study of taxol, taxotere, and derivatives using the electronic indices methodology (EIM). J Chem Inf Comput Sci 2003; 43: 699-706.
38. Ojima I, Lin S, Chakravarty S, et al. Syntheses and structure-activity relationships of novel nor-seco taxoids. J Org Chem 1998; 63: 1637-45.
39. Fang WS, Liang XT. Recent progress in structure activity relationship and mechanistic studies of taxol analogues. Mini Rev Med Chem 2005; 5: 1-12.
40. Shao J. Linear model selection by cross-validation. J Am Stat Assoc 1993; 88: 486-94.
41. Golbraikh A, Shen M, Xiao Z, Xiao YD, Lee KH, Tropsha A. Rational selection of training and test sets for the development of validated QSAR models. J Comput Aided Mol Des 2003; 17: 241-53.
42. ***, MDF Predictor, ©2005, Virtual Library of Free Software [cited 2007 October]. Available from: URL: http://l.academicdirect.org/Chemistry/SARs/MDF_SARs/sar/.
43. Lockyer J, Gondocz ST, Thivierge RL. Knowledge translation: the role and place of practice reflection. J Contin Educ Health Prof 2004; 24: 50-6.