※ pFunK INTRODUCTION:
Lysine β-hydroxybutyrylation (Kbhb) is a post-translational modification (PTM) induced by the ketogenic diet (KD), a diet that shows therapeutic effects for multiple human diseases. This modification plays a significant role in regulating cellular functions and responses. In 2016, Xie et al. identified 44 histone Kbhb sites, including the critical site H3K9bhb, which is involved in the epigenetic regulation of gene expression in various tissues and cell types such as the liver, pancreas, small intestine, and CD8+ T cells. Despite the importance of Kbhb, only a few sites have been reported to be functionally important. Effective methods for predicting functionally important Kbhb sites are therefore needed to understand their roles in biological processes and disease mechanisms.
To address this, we developed a computational framework named pFunK (prediction of functionally important lysine modification sites). The implementation of pFunK comprised three steps. First, a pre-training model, pFunK-P, was built on the transformer, a state-of-the-art (SOTA) deep learning architecture, and trained on a large data set of 145,657 non-redundant sites covering 29 types of lysine modifications to learn the “in-context” information in short sequences around lysine modification sites. Second, in pFunK-T, the transformer-based model was fine-tuned by transfer learning to further learn the characteristics of Kbhb. Besides the transformer, we also integrated 10 types of sequence and structural features, and the model was further fine-tuned by Model-Agnostic Meta-Learning (MAML), a widely used few-shot learning algorithm. For this step, we combined 6,318 reported and 5,304 identified Kbhb sites, obtaining 10,264 non-redundant known Kbhb sites. After eliminating homologous sequences, the remaining 6,932 Kbhb sites were used as the benchmark data set for transfer learning and MAML fine-tuning. Third, MAML was adopted to capture the functional relevance of Kbhb, using only 9 functionally important Kbhb sites for fine-tuning. Our model demonstrated superior performance and is applicable to other acylation modifications such as Kcr, Kac, and Kla.
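As an illustration of the data preparation described above, the following minimal Python sketch extracts the short sequence window around each lysine and merges two site collections into a non-redundant set. It is not the pFunK implementation: the function names `lysine_windows` and `merge_sites`, the flank length of 7 residues, the `-` padding character, and the example protein identifiers are all assumptions made for illustration.

```python
from typing import List, Set, Tuple

# A site is (protein identifier, 1-based lysine position); identifiers here are hypothetical.
Site = Tuple[str, int]


def lysine_windows(sequence: str, flank: int = 7, pad: str = "-") -> List[Tuple[int, str]]:
    """Return (1-based position, window) for every lysine (K) in `sequence`.

    Each window holds `flank` residues on both sides of the lysine; positions
    beyond the protein termini are filled with `pad`. The default flank of 7
    is an assumed value, not one stated for pFunK.
    """
    padded = pad * flank + sequence + pad * flank
    windows = []
    for i, residue in enumerate(sequence):
        if residue == "K":
            # Residue i sits at index i + flank in `padded`, so this slice
            # of length 2 * flank + 1 is centred on the lysine.
            windows.append((i + 1, padded[i : i + 2 * flank + 1]))
    return windows


def merge_sites(*site_sets: Set[Site]) -> Set[Site]:
    """Union several site collections, keeping each (protein, position) once."""
    merged: Set[Site] = set()
    for sites in site_sets:
        merged |= sites
    return merged


if __name__ == "__main__":
    # Toy peptide with lysines at positions 2 and 6.
    for position, window in lysine_windows("MKTAYK", flank=2):
        print(position, window)

    reported = {("P1", 5), ("P1", 9)}
    identified = {("P1", 9), ("P2", 3)}
    print(len(merge_sites(reported, identified)))
```

The same kind of merge accounts for the counts given above: 6,318 reported plus 5,304 identified sites give 11,622 entries, so a non-redundant total of 10,264 implies that 1,358 sites were shared between the two sources.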
pFunK is freely available for academic research at: http://pFunK.biocuckoo.cn/.
For publication of results please cite the following article: JunHong Qin, Xinhe Huang, Shengsong Gou, Sitao Zhang, Yujie Gou, Qian Zhang, Hongyu Chen, Lin Sun, Miaomiao Chen, Dan Liu, Guanjun Gao, Cheng Han, Min Tang, Zihao Feng, Shenghui Niu, Lin Zhao, Yingfeng Tu, Zexian Liu, Weimin Xuan, Lunzhi Dai, Da Jia*, Yu Xue*. 2022, Submitted