Evaluation Of The Statistics-Based Ames Mutagenicity Model Sarah Nexus And Interpretation Of The Results Obtained

Alex Cayley

Summary
• What is a “statistics-based Ames mutagenicity model” and why are such models important?
• How can we judge how useful these models will be? (validation statistics, expert interpretation)
• How well does Sarah Nexus (SX) version 1.1 predict, and how can results from the program be interpreted to increase performance?
• Worked examples of SX version 1.1 predictions

What Are Statistical Models?
• In our field there is no clear-cut definition, but…
• Data (binary or continuous) for a given endpoint is fed into the model builder
• An algorithm based on one or more descriptors is used to distinguish compound categories or predict values
• Any patterns found in the data are the result of statistical relationships and are machine learnt, NOT the product of human intervention
• The definition is important for ICH M7 guidance compliance
• It may not be as important elsewhere, and the distinction is blurring…

ICH M7 Guidance

What Makes A Good Statistical Model?
[Figure: predictors compared by how often they are “right” and how much explanation they give. Labels: Super Expert Scientist, Expert Scientist. Captions in slide order:]
• “right” = Every time; Explanation = Full and Reasoned
• “right” = Most Times; Explanation = None
• “right” = Most Times; Explanation = Full and Reasoned
• “right” = Sometimes; Explanation = Some
• “right” = Sometimes; Explanation = None

What Makes A Good Statistical Model?
[Figure labels: Training Data, QSAR Model, Validation, Performance Stats, Expert Interpretation]

Test Set   Source      BA   SEN   SPEC
1          Pharma A    72    76     68
2          Pharma B    63    38     89
3          Pharma C    72    64     80
4          Pharma D    75    65     85
14         Pharma N    xx    xx     xx

Validation of (Q)SAR Models In The Literature

Validation Of Sarah Nexus (v1.1)
• Specificity = 69-91% (83% mean)
• Sensitivity = 38-68% (55% mean)
• $\text{Balanced Accuracy} = \dfrac{\text{Sens} + \text{Spec}}{2}$ = 62-77% (69% mean)
• Positive <50: incorrectly assign 3-4 = ~10%

Prediction Scenarios in Sarah Nexus
• Overall Prediction + Confidence
• Overruled Hypotheses + Confidence
• Positive
• Negative

Negative Predictions

Positive Predictions

Confidence Correlation With Predictivity

Equivocal Predictions

Out Of Domain Predictions

An Update

Sarah Nexus V1.2

Data Set ID      1      2      3      4      5   Mean
SIZE           879    513   4018   2862   4040
POS            279     97    576    170    725
NEG            600    416   3442   2692   3315
BAC             72     74     67     67     62     68
ACC             74     81     78     83     81
SEN             68     62     51     48     33     53
SPEC            76     86     82     85     90     84
PPV             55     48     32     17     42
NPV             85     91     91     96     87
TP             143     44    231     57    186
TN             372    284   2261   1661   2444
FP             115     48    488    283    260
FN              67     27    224     61    371
COV             79     79     80     72     81     78
EQ             136     76    498    646    430
OOD             46     34    316    192    374
EM              51      0    158     28    243

Sarah Nexus V2.0.1

Data Set ID      1      2      3      4      5   Mean
SIZE           879    513   4018   2862   4040
POS            279     97    576    170    725
NEG            600    416   3442   2692   3315
BAC             77     72     69     67     68     71
ACC             76     76     78     83     80
SEN             80     64     56     48     50     60
SPEC            74     80     82     85     86     81
PPV             60     44     33     17     44
NPV             89     90     92     96     89
TP             188     51    254     56    282
TN             361    253   2262   1654   2205
FP             126     65    507    283    361
FN              46     29    198     61    278
COV             82     78     80     72     77     78
EQ             136     84    524    621    590
OOD             22     31    272    187    343
EM             121      0    313     27    389

Conclusions
• Proprietary data sets can give a good indication of the performance of statistical prediction systems
• Sarah Nexus performs well when tested against a number of different proprietary validation sets
• Additional information provided for each prediction is also important in aiding the user to make a final decision
• Negative predictions based purely on negative hypotheses are more reliable
• Positive predictions with a higher confidence are more reliable

Barber et al.; Reg. Tox. Pharm.; 76; 7-20 (2016)
http://www.sciencedirect.com/science/article/pii/S0273230015301410

Acknowledgements
• Sandy Weiner
• Joerg Wichard
• Amanda Giddings
• Susanne Glowienke
• Alexis Parenty
• Alessandro Briggo
• Hans-Peter Spirkl
• Alexander Amberg
• Ray Kemper
• Nigel Greene
• Chris Barber
• Thierry Hanser
• Alex Harding
• Crina Heghes
• Jonathan Vessey
• Stephane Werner

Questions?

Work in progress disclaimer
This document is intended to outline our general product direction and is for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon. The development, release, and timing of any features or functionality described for Lhasa Limited’s products remains at the sole discretion of Lhasa Limited.
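
Worked check of the balanced accuracy figures
The validation slides define balanced accuracy as the mean of sensitivity and specificity. The short Python sketch below recomputes it from the sensitivity and specificity percentages quoted in the deck. The function names are illustrative assumptions and are not part of Sarah Nexus, and the confusion-matrix counts at the end are hypothetical values chosen only to show how the two rates are derived from counts.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of experimentally positive compounds predicted positive."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of experimentally negative compounds predicted negative."""
    return tn / (tn + fp)


def balanced_accuracy(sens: float, spec: float) -> float:
    """Mean of sensitivity and specificity, in the same units as the inputs."""
    return (sens + spec) / 2


# Mean Sarah Nexus v1.1 figures quoted in the deck, in percent.
mean_ba = balanced_accuracy(55, 83)

# (SEN, SPEC) pairs for test sets 1-4 as listed on the performance slide.
test_sets = {1: (76, 68), 2: (38, 89), 3: (64, 80), 4: (65, 85)}
per_set_ba = {
    set_id: balanced_accuracy(sen, spec)
    for set_id, (sen, spec) in test_sets.items()
}

# Hypothetical counts (not from the deck) to illustrate the count-based helpers.
hypothetical_sens = sensitivity(tp=55, fn=45)
hypothetical_spec = specificity(tn=83, fp=17)

if __name__ == "__main__":
    print(f"mean BA: {mean_ba:.1f}%")
    for set_id, ba in per_set_ba.items():
        print(f"test set {set_id}: BA = {ba:.1f}%")
```

Test set 2 comes out at 63.5 here, where the slide reports 63. The inputs are themselves rounded percentages, so differences of this size are expected.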