CNVkit: Genome-wide copy number from high-throughput sequencing
- Source code:
- License:
- Packages:
- Article:
- Q&A:
- Consulting: Contact DNAnexus Science
CNVkit is a Python library and command-line software toolkit to infer and visualize copy number from high-throughput DNA sequencing data. It is designed for use with hybrid capture, including both whole-exome and custom target panels, and short-read sequencing platforms such as Illumina and Ion Torrent.
Requirements: Python 3.10 or later. The R statistical environment with the DNAcopy package is also required for the default CBS segmentation algorithm.
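The requirements above can be checked before installation with a short script. This is only a minimal sketch, not part of CNVkit: the helper names `python_ok` and `dnacopy_available` are hypothetical, and the script assumes that R, if installed, exposes an `Rscript` executable on the PATH.

```python
import shutil
import subprocess
import sys

# Minimum interpreter version, taken from the requirement stated above.
MIN_PYTHON = (3, 10)


def python_ok(version=None):
    """Return True if the given (or running) Python version meets the minimum."""
    if version is None:
        version = sys.version_info
    return tuple(version[:2]) >= MIN_PYTHON


def dnacopy_available(rscript="Rscript"):
    """Return True if R can load the DNAcopy package.

    Assumption: R is invoked through an `Rscript` executable on the PATH.
    Returns False if the executable is missing or the package fails to load.
    """
    exe = shutil.which(rscript)
    if exe is None:
        return False
    # Rscript exits with a non-zero status if library() raises an error.
    result = subprocess.run(
        [exe, "-e", "suppressMessages(library(DNAcopy))"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    print("Python version sufficient:", python_ok())
    print("R DNAcopy loadable:", dnacopy_available())
```

If the second check reports False, only the default CBS segmentation is affected, according to the requirement above.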
Citation
If you use this software in a publication, please cite our paper describing CNVkit:
Talevich, E., Shain, A.H., Botton, T., & Bastian, B.C. (2016). CNVkit: Genome-wide copy number detection and visualization from targeted sequencing. PLOS Computational Biology 12(4):e1004873.
Also please cite the supporting paper for the segmentation method you use:
- PSCBS and DNAcopy (cbs, the default): Olshen, A.B., Bengtsson, H., Neuvial, P., Spellman, P.T., Olshen, R.A., & Seshan, V.E. (2011). Parent-specific copy number in paired tumor-normal studies using circular binary segmentation. Bioinformatics 27(15):2038–46.
  Venkatraman, E.S., & Olshen, A.B. (2007). A faster circular binary segmentation algorithm for the analysis of array CGH data. Bioinformatics 23(6):657–63.
- HaarSeg (haar): Ben-Yaacov, E., & Eldar, Y.C. (2008). A fast and flexible method for the segmentation of aCGH data. Bioinformatics 24(16):i139–45.
- pomegranate (HMM segmentation methods): Schreiber, J. (2018). pomegranate: Fast and Flexible Probabilistic Modeling in Python. Journal of Machine Learning Research 18(164):1–6.