eHiTS LASSO: Ligand Activity in Surface Similarity Order

eHiTS LASSO: results, trained on just 2% of the actives

Following the ligand diversity classification of Hert et al. (J. Chem. Inf. Model., 46, 462-470, 2006), three groups of ligands with low, medium and high Mean Pairwise Similarity (MPS) were extracted from the MDDR database; a high MPS corresponds to low diversity. Five activity families were chosen from each of the three groups, giving 15 sets in total. Detailed results are in the table below; the chart above gives a quick overview.

Data specifications:

| Diversity group | Activity class | Activity key | # Compounds | Diversity (MPS)* | % recovered in top 2% | % recovered in top 5% | % recovered in top 10% |
|---|---|---|---|---|---|---|---|
| High Diversity | nitric oxide synthase inhibitors | 12464 | 377 | 0.19 | 25.37% | 37.07% | 48.78% |
| High Diversity | muscarinic (M1) agonists | 9249 | 848 | 0.21 | 45.02% | 63.41% | 70.69% |
| High Diversity | aromatase inhibitors | 75721 | 513 | 0.23 | 24.76% | 36.43% | 52.38% |
| High Diversity | dopamine β-hydroxylase inhibitors | 31281 | 95 | 0.23 | 65.33% | 92.00% | 98.67% |
| High Diversity | aldose reductase inhibitors | 43210 | 882 | 0.23 | 21.02% | 29.30% | 38.24% |
| Medium Diversity | 5 HT2B antagonists | 6249 | 90 | 0.31 | 31.67% | 78.33% | 91.67% |
| Medium Diversity | carbonic anhydrase inhibitors | 16200 | 255 | 0.31 | 39.18% | 47.94% | 61.34% |
| Medium Diversity | protease inhibitors | 78330 | 574 | 0.31 | 10.85% | 13.45% | 18.87% |
| Medium Diversity | CRF antagonists | 6215 | 254 | 0.32 | 46.52% | 60.43% | 77.83% |
| Medium Diversity | thrombin inhibitors | 37110 | 803 | 0.32 | 19.62% | 34.72% | 35.85% |
| Low Diversity | cephalosporins | 64200 | 1312 | 0.5 | 72.90% | 85.33% | 90.25% |
| Low Diversity | adenosine (A1) agonists | 7707 | 88 | 0.52 | 94.20% | 100.00% | 100.00% |
| Low Diversity | adenosine (A2) agonists | 7708 | 71 | 0.54 | 98.25% | 98.25% | 98.25% |
| Low Diversity | monocyclic β-lactams | 64100 | 76 | 0.55 | 97.10% | 98.55% | 100.00% |
| Low Diversity | vitamin D analogues | 75755 | 279 | 0.57 | 80.41% | 87.11% | 94.33% |
| Cleaned MDDR | Cleaned MDDR | | 78050 | 0.200 | | | |


* Diversity measurements come from Hert's paper and were calculated on a slightly different dataset. Our dataset used the same activity classes, but duplicate structures were removed using our own feature vector comparison, which also removes structures that are not identical but produce the same feature vector. This is the reason for the reduced data size.
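The "% recovered in top N%" columns can be read as the share of a family's actives that appear in the top N% of the ranked database. The following is a minimal sketch of that calculation, not the code used to produce the table. The function name `percent_recovered`, the rounding of the cutoff, and the choice of denominator (all actives passed in) are assumptions made for illustration; the page does not state, for example, whether the training actives were excluded.

```python
def percent_recovered(ranked_ids, active_ids, top_fraction):
    """Percentage of actives found in the top `top_fraction` of a ranked list.

    ranked_ids   : molecule ids, best-scoring first (hypothetical input format)
    active_ids   : ids of the known actives for one activity class
    top_fraction : e.g. 0.02 for the top 2% of the database
    """
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    actives = set(active_ids)
    if not actives:
        raise ValueError("active_ids must not be empty")
    # Assumption: the cutoff is rounded to the nearest whole molecule.
    cutoff = max(1, round(len(ranked_ids) * top_fraction))
    hits = sum(1 for mol_id in ranked_ids[:cutoff] if mol_id in actives)
    return 100.0 * hits / len(actives)


# Toy data: 10 ranked molecules, 3 of them active (ids starting with "a").
ranked = ["a1", "d1", "a2", "d2", "d3", "d4", "a3", "d5", "d6", "d7"]
actives = {"a1", "a2", "a3"}
for frac in (0.2, 0.5, 1.0):
    print(f"top {frac:.0%}: {percent_recovered(ranked, actives, frac):.2f}% recovered")
```

On the toy list, one of the three actives falls in the top 20%, two in the top 50%, and all three in the full list.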

Data preparation:

For each molecule in the MDDR database, the feature vector was calculated, and any duplicate feature vectors were removed. This eliminated duplicate molecules and ensured that the dataset was usable in the LASSO testing experiment. Each set of ligands was extracted from the cleaned MDDR database and used in the experiments. For each set, 2% of the actives were randomly selected and used to train the eHiTS LASSO neural nets. The trained neural nets were then used to screen the full cleaned MDDR database.
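The preparation steps above (remove duplicate feature vectors, then randomly pick 2% of a family's actives for training) might be sketched as follows. This is an illustration under stated assumptions, not the eHiTS LASSO implementation: the feature vectors are stand-in tuples of numbers, the function names are hypothetical, the first molecule seen is kept for each vector, and at least one active is always selected.

```python
import random


def dedupe_by_feature_vector(molecules):
    """Keep the first molecule seen for each distinct feature vector.

    molecules: iterable of (mol_id, feature_vector) pairs (hypothetical format).
    """
    seen = set()
    unique = []
    for mol_id, vector in molecules:
        key = tuple(vector)  # vectors must be hashable to be compared in a set
        if key not in seen:
            seen.add(key)
            unique.append((mol_id, vector))
    return unique


def pick_training_actives(active_ids, fraction=0.02, seed=0):
    """Randomly choose a fraction of the actives (at least one) for training."""
    ids = sorted(active_ids)  # sorted so the draw is reproducible for a given seed
    n_train = max(1, round(len(ids) * fraction))
    return set(random.Random(seed).sample(ids, n_train))


# Toy data: m1 and m2 share a feature vector, so only m1 survives.
molecules = [("m1", (1, 0, 2)), ("m2", (1, 0, 2)), ("m3", (0, 3, 1))]
unique = dedupe_by_feature_vector(molecules)

# 100 hypothetical actives: 2% of them gives a training set of 2.
family_actives = [f"act{i:03d}" for i in range(100)]
train = pick_training_actives(family_actives)
print([mol_id for mol_id, _ in unique], len(train))
```

Deduplicating on the feature vector rather than on the structure matches the footnote above: two molecules that are not identical but map to the same vector are treated as one entry.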




Copyright © 2011 SimBioSys Inc., All rights reserved.