Machine learning-based prediction models to guide the selection of Cas9 variants for efficient gene editing
Jianbo Li, Panfeng Wu, Zhoutao Cao, Guanlan Huang, Zhike Lu, Jianfeng Yan, Heng Zhang, Yangfan Zhou, Rong Liu, Hui Chen, Lijia Ma, Mengcheng Luo
Affiliations
Jianbo Li
Hubei Provincial Key Laboratory of Developmentally Originated Disease, TaiKang Center for Life and Medical Sciences, School of Basic Medical Sciences, Wuhan University, Wuhan 430072, China; AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China; Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
Panfeng Wu
Hubei Provincial Key Laboratory of Developmentally Originated Disease, TaiKang Center for Life and Medical Sciences, School of Basic Medical Sciences, Wuhan University, Wuhan 430072, China; AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China; Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
Zhoutao Cao
AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China
Guanlan Huang
AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China
Zhike Lu
Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
Jianfeng Yan
AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China
Heng Zhang
AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China; Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
Yangfan Zhou
Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China
Rong Liu
Hubei Provincial Key Laboratory of Developmentally Originated Disease, TaiKang Center for Life and Medical Sciences, School of Basic Medical Sciences, Wuhan University, Wuhan 430072, China
Hui Chen
AIdit Therapeutics, 1 Yunmeng Road, Building 1, Hangzhou 310024, Zhejiang, China
Lijia Ma
Westlake Laboratory, School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou 310024, Zhejiang, China; Corresponding author
Mengcheng Luo
Hubei Provincial Key Laboratory of Developmentally Originated Disease, TaiKang Center for Life and Medical Sciences, School of Basic Medical Sciences, Wuhan University, Wuhan 430072, China; Corresponding author
Summary: The growing number of Cas9 variants has attracted broad interest, as these variants are designed to expand CRISPR applications. New Cas9 variants typically feature higher editing efficiency, improved editing specificity, or alternative PAM sequences. To select Cas9 variants and gRNAs for high-fidelity and efficient genome editing, it is crucial to systematically quantify the editing performance of gRNAs and to develop prediction models based on high-quality datasets. Using synthetic gRNA-target paired libraries and next-generation sequencing, we compared the activity and specificity of gRNAs for four SpCas9 variants. The nucleotide composition of the PAM-distal region had more influence on the editing efficiency of HiFi Cas9 and LZ3 Cas9. We further developed machine learning models to predict gRNA efficiency and specificity for the four Cas9 variants. To aid users from a broad range of research areas, the machine learning models for predicting gRNA editing efficiency at human genomic sites are available on our website.