Developing a new drug is a costly and risky affair. Recent estimates indicate that bringing a new drug to market can cost US$985 million. Toxicities cause about 70% of drug failures, a worrying figure at a time when antimicrobial resistance is on the rise and the world is enveloped in the COVID-19 pandemic.
A toxicity test generates data on the adverse effects (toxicity endpoints) of a drug on human or animal health as well as on the environment. Such data from animal models (in vivo) and studies using different cell lines (in vitro) are used to support various regulatory submissions and approvals. Toxicity data can save lives and prevent injuries as well as wasteful investments. Pharmaceutical companies are adopting the “fail early, fail fast” mentality by doing everything possible, including using biomarkers, to distinguish failure from success, thus helping to avoid wasteful clinical trials.
While holding much of their unpublished data as proprietary information, companies still find the validation of multiple biomarkers to be a slow and expensive process, which negates much of the “fail cheap” goal. Additionally, ethical committees are increasingly putting a brake on the use of animal models. Therefore, with large, free chemical compound libraries now available, the replacement of costly in vivo toxicity tests with computational methods such as in silico prediction tools is imminent.
Moving to in silico and the use of AI/ML in drug toxicology:
Pharmaceutical companies use in silico methods and artificial intelligence & machine learning (AI/ML) techniques to select drug candidates that are likely to succeed in animal models, thereby avoiding late-stage withdrawal. Critical regulatory decisions are also based on data generated by these two approaches.
In silico methods fall into three broad types: qualitative classification, quantitative regression, and read-across. All three draw on computational resources such as toxicity databases. Qualitative classification labels a compound as either toxic or non-toxic, whereas quantitative structure-activity relationship (QSAR) modeling uses molecular descriptors to predict numerical toxicity values for a small number of similar chemicals. The read-across method predicts the unknown toxicity of a compound that has no data of its own from similar chemicals with known toxicity.
In silico methods predict toxicity through computational models, QSARs, and algorithms trained on toxicity data. They use existing data derived from molecular structures to predict the toxicity and biological activities of a drug.
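The read-across idea described above can be illustrated with a minimal sketch. It assumes each compound is reduced to a set of structural-feature bits (a stand-in for a real chemical fingerprint) and that similarity is measured with the Tanimoto coefficient. The fingerprints, toxicity values, and function names are all made up for illustration and do not come from any real dataset or tool.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprint bit sets."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def read_across(query: set, analogues: list, k: int = 2) -> float:
    """Similarity-weighted mean toxicity of the k most similar analogues."""
    scored = sorted(
        ((tanimoto(query, fp), tox) for fp, tox in analogues),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(sim for sim, _ in scored)
    if total == 0:
        raise ValueError("no structurally similar analogue found")
    return sum(sim * tox for sim, tox in scored) / total


# Hypothetical analogues: (fingerprint bits, measured toxicity, arbitrary units)
ANALOGUES = [
    ({1, 2, 3, 4}, 2.0),
    ({1, 2, 3, 9}, 4.0),
    ({7, 8, 9}, 10.0),
]

# Hypothetical untested compound
query = {1, 2, 3, 5}
predicted = read_across(query, ANALOGUES)
```

Here the two closest analogues share three of five features with the query, so the prediction is simply the average of their known values, while the dissimilar third compound is ignored. Real read-across workflows also weigh mechanistic and metabolic similarity, which this sketch leaves out.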
The QSAR model relies on ML tools and chemical datasets to predict toxicity from molecular structures. Computational algorithms in ML build classification or regression models that describe the complex relationships between the chemical structure of a drug and its toxicity endpoints based on existing data. Algorithms are broadly categorized as supervised learning, which covers classification and regression problems trained on labeled data, or unsupervised learning, which covers problems such as clustering on unlabeled data. Artificial neural networks loosely mimic the structure and function of biological neural networks and can be applied to predicting specific protein sequences and the QSAR of a drug.
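As a rough illustration of how a regression model and a classification model relate, the sketch below fits a one-descriptor QSAR line by ordinary least squares and then turns the predicted value into a toxic/non-toxic label with a cutoff. The descriptor values, toxicity values, and the cutoff of 4.0 are invented for the example; production QSAR models use many descriptors and dedicated ML libraries.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: one molecular descriptor per compound
# and a measured toxicity value (arbitrary units).
DESCRIPTOR = [1.0, 2.0, 3.0, 4.0]
TOXICITY = [2.1, 2.9, 4.2, 4.8]

slope, intercept = fit_line(DESCRIPTOR, TOXICITY)


def predict_value(x: float) -> float:
    """Quantitative (regression) prediction of the toxicity value."""
    return slope * x + intercept


def classify(x: float, threshold: float = 4.0) -> str:
    """Qualitative label derived from the regression output (assumed cutoff)."""
    return "toxic" if predict_value(x) >= threshold else "non-toxic"
```

Both steps are supervised learning, since the fit depends on compounds whose toxicity is already known. An unsupervised method would instead group the compounds by descriptor values alone, without using the toxicity column.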
Limitations to current in silico tools:
Because the training datasets behind these powerful computational, data-driven methods carry inherent bias, the results are sensitive only to the biochemical properties those datasets capture. In response, automated training data and quantum computers are being used with generative adversarial networks (GANs) to generate new compounds rather than simply discriminate between existing ones, as traditional ML methods do. However, the training datasets used by current network-based ML methods are too general, which makes it difficult to interpret complex protein structures from them.
How complex can the molecules be for toxicology predictions?
Proteins interact to form biological complexes involved in toxicity. In drug development, the mapping of several interactions, including protein-protein interactions (PPIs), relies on molecular interaction networks. However, deep learning systems such as graph neural networks (GNNs) and graph convolutional networks (GCNs) fail to determine the three-dimensional shape of protein complexes. Thus, DeepMind's AlphaFold system has been hailed as a solution to the 50-year-old protein folding grand challenge, as it can successfully predict the 3D shape of a protein. Recently, it predicted several protein structures of the coronavirus with a high degree of accuracy.
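To make the network-based view concrete, here is a small sketch of a PPI network stored as a graph, followed by one neighbourhood-averaging step of the kind that graph neural networks build on. It is a simplified, weight-free stand-in and not a full GCN layer. The protein names, edges, and feature values are hypothetical.

```python
# Hypothetical undirected protein-protein interactions
EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]


def build_adjacency(edges):
    """Map each protein to the set of proteins it interacts with."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def aggregate(adj, features):
    """Replace each node's feature with the mean over itself and its neighbours."""
    updated = {}
    for node, neighbours in adj.items():
        group = [features[node]] + [features[n] for n in neighbours]
        updated[node] = sum(group) / len(group)
    return updated


# Hypothetical scalar feature per protein, e.g. a flag for a known toxicity link
FEATURES = {"A": 1.0, "B": 0.0, "C": 0.0, "D": 0.0}

adj = build_adjacency(EDGES)
updated = aggregate(adj, FEATURES)
```

Note that the graph records only which proteins interact. It contains no atomic coordinates, which is one way to see why such models say nothing about the 3D shape of a complex.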
Looking at molecular interactions as a proxy for toxicology:
Advanced deep learning networks coupled with multiple sequence alignment, such as AlphaFold, can generate structures of protein-pair complexes that serve as a basis for predicting their toxicity. However, the application of deep learning to predicting heteromeric protein complexes involving antibodies has not previously been well explored.
With the DeepComplex automated web server, it is now possible to predict protein complexes for protein dimers of any organism. In addition, two French companies, Mabsilico and OSE Immunotherapeutics, discovered therapeutic monoclonal antibodies within weeks of each other using deeptech solutions driven by AI/ML. The model has been deployed to develop antibody drug candidates against coronavirus disease more quickly.
Conclusion:
In silico methods are steadily providing economic and ethical benefits as an alternative to the traditional animal models in drug development. Combining in silico and ML methods is effective for the simple assessment of the toxic effects of a drug, while complex molecular interactions require advanced deep learning networks with multiple sequence alignment such as the new AlphaFold system that accurately elucidates the 3D structure of proteins. Antibodies involved in toxicity-related complexes can be discovered using deeptech AI/ML solutions by Mabsilico and OSE Immunotherapeutics.
Regulations such as Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) in Europe and the Frank R. Lautenberg Chemical Safety for the 21st Century Act in the United States promote in silico methods across all health sectors, further highlighting the extensive economic and ethical contributions of in silico tools in promoting drug safety.