Revisiting Target-Oriented Fragment Linking: SyntaLinker-Hybrid

Now is the time to revisit fragment-based drug design (FBDD) in the age of artificial intelligence (AI). FBDD has grown in popularity over the past decade, helped by advances in computational methodology, more powerful computer hardware and improved experimental environments.

Despite intensive worldwide research by both industry and academia, FBDD has not yet matured into a decisive breakthrough technology. Quantum mechanical calculations1),2) and database search-based approaches3),4) have had limited success, probably because of the difficulty of running calculations on large data sets and of searching huge databases to identify chemical structures with better activity and PK/PD properties.

In this respect, AI-based FBDD offers an opportunity for target-oriented drug design through the integration of chemical fragments.

Here is an interesting paper reporting SyntaLinker-Hybrid5), an improved version of SyntaLinker6) aimed at target-based drug design. SyntaLinker uses a conditional transformer, framed as a sentence-completion task, to link fragments by generating linkers under constraints such as the number of hydrogen-bond donors/acceptors and the shortest linker bond distance.
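To make the "sentence completion" idea more concrete, here is a minimal sketch of how a pair of fragments and one constraint might be turned into a source "sentence" for a conditional transformer. The control-token format (`L_4`), the function names and the tokenizer pattern are our own illustrative assumptions and are not taken from the SyntaLinker code; only the shortest linker bond distance is encoded here.

```python
import re

# Simplified SMILES tokenizer (an assumption for illustration, not the
# tokenizer used by SyntaLinker): bracket atoms, two-letter halogens,
# single-letter atoms, ring-closure digits, and bond/branch symbols.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[A-Za-z]|%[0-9]{2}|[0-9]|[().=#\-+\\/:~@?>*$]"
)


def tokenize_smiles(smiles):
    """Split a SMILES string into tokens and check nothing was dropped."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"could not tokenize SMILES: {smiles!r}")
    return tokens


def build_source(frag_a, frag_b, linker_bond_distance):
    """Build a space-separated source sequence: control token + fragment pair.

    The hypothetical control token L_<n> encodes the shortest linker bond
    distance; the model would be asked to complete the full molecule.
    """
    control = f"L_{linker_bond_distance}"
    tokens = [control] + tokenize_smiles(frag_a) + ["."] + tokenize_smiles(frag_b)
    return " ".join(tokens)


if __name__ == "__main__":
    # [*] marks the attachment point on each terminal fragment.
    print(build_source("c1ccccc1[*]", "[*]CCO", 4))
```

Further constraints, such as hydrogen-bond donor/acceptor counts, could be added as extra control tokens in the same way, although how the original model encodes them is not described here.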

SyntaLinker-Hybrid combines transfer learning from the SyntaLinker model with fragment hybridization to build a virtual library for a specific target. It starts from a SyntaLinker model trained in a target-unspecific manner, so that a large number of fragments and linkers from ChEMBL are incorporated at the training stage. Transfer learning then gives a more reasonable chance of generating potentially active molecules from only a small set of target-specific data.

SyntaLinker-Hybrid needs a dataset of molecules known to be active against the target protein, both to build the optimized model for each target and to carry out fragment hybridization. Fragment hybridization pairs two terminal fragments taken from different active molecules; the resulting pairs are used for model sampling.
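The pairing step can be sketched in a few lines of Python. This is only an illustration of the idea as described above, assuming the terminal fragments of each active molecule have already been extracted as SMILES strings with a `[*]` attachment point; the molecule names, fragments and function name are made up for the example.

```python
from itertools import combinations


def hybridize_fragments(terminal_fragments):
    """Pair terminal fragments that come from different active molecules.

    terminal_fragments maps a molecule identifier to the list of terminal
    fragments extracted from it. Fragments from the same molecule are never
    paired with each other. Unordered duplicate pairs are removed.
    """
    pairs = set()
    molecules = sorted(terminal_fragments.items())
    for (_, frags_a), (_, frags_b) in combinations(molecules, 2):
        for frag_a in frags_a:
            for frag_b in frags_b:
                pairs.add(tuple(sorted((frag_a, frag_b))))
    return sorted(pairs)


if __name__ == "__main__":
    # Hypothetical active molecules, each with two terminal fragments.
    actives = {
        "mol_1": ["c1ccccc1[*]", "[*]C(=O)N"],
        "mol_2": ["[*]c1ccncc1", "[*]CCO"],
    }
    for pair in hybridize_fragments(actives):
        print(pair)
```

Each resulting pair would then be handed to the fine-tuned model, which proposes linkers to join the two fragments.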

The authors demonstrated its performance and accuracy on four targets (BRD4, FXR, HDAC1, DRD2). According to the paper, datasets of 714 to 4311 active molecules were used for model construction. The evaluation examined how well the generated molecules cover the chemical space of the dataset, and used kernel density estimation (KDE) to analyze the probability distributions of docking scores and shape similarity.
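For readers unfamiliar with KDE, the following is a minimal sketch of a one-dimensional Gaussian KDE in plain Python. The docking scores and the bandwidth are placeholder values chosen for illustration, not results from the paper, and the `overlap` helper is our own simple way of summarizing how similar two estimated distributions are; the paper may well use a different comparison.

```python
import math


def gaussian_kde(samples, bandwidth):
    """Return a function estimating the probability density of `samples`.

    Each sample contributes a Gaussian bump of width `bandwidth`; the
    estimate is the average of those bumps.
    """
    if not samples or bandwidth <= 0:
        raise ValueError("need at least one sample and a positive bandwidth")
    norm = 1.0 / (len(samples) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


def overlap(p, q, lo, hi, steps=2000):
    """Integrate min(p, q) over [lo, hi] with the midpoint rule.

    Values near 1 mean the two densities almost coincide; values near 0
    mean they barely overlap. This is an illustrative summary statistic.
    """
    dx = (hi - lo) / steps
    mids = (lo + (i + 0.5) * dx for i in range(steps))
    return sum(min(p(x), q(x)) for x in mids) * dx


if __name__ == "__main__":
    # Placeholder docking scores (kcal/mol), invented for this example.
    active_scores = [-9.1, -8.4, -7.9, -8.8]
    generated_scores = [-8.9, -8.1, -8.6, -7.5]
    p = gaussian_kde(active_scores, bandwidth=0.5)
    q = gaussian_kde(generated_scores, bandwidth=0.5)
    print(f"overlap: {overlap(p, q, -15.0, -2.0):.3f}")
```

In practice one would more likely use an established implementation such as the one in SciPy, which also chooses the bandwidth automatically.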

The chemical space and KDE distributions of the generated molecules are closely similar to those of the active molecules, with a statistically reasonable spread. This indicates that SyntaLinker-Hybrid could be a useful tool for revisiting FBDD, generating a set of molecules to synthesize at the lead optimization stage.

SyntaLinker-Hybrid is one newly emerged tool for FBDD, and more reliable models may well appear within a few years, since the AI field is advancing very quickly. We would like to share updates on this blog. The opportunities are expanding, and we’d be pleased if you would try this technology, or another, with us.

