The khmer software package: enabling efficient nucleotide sequence analysis [version 1; referees: 2 approved, 1 approved with reservations]

Michael R. Crusoe; Hussien F. Alameldin; Sherine Awad; Elmar Boucher; Adam Caldwell; Reed Cartwright; Amanda Charbonneau; Bede Constantinides; Greg Edvenson; Scott Fay; Jacob Fenton; Thomas Fenzl; Jordan Fish; Leonor Garcia-Gutierrez; Phillip Garland; Jonathan Gluck; Iván González; Sarah Guermond; Jiarong Guo; Aditi Gupta; Joshua R. Herr; Adina Howe; Alex Hyer; Andreas Härpfer; Luiz Irber; Rhys Kidd; David Lin; Justin Lippi; Tamer Mansour; Pamela McA'Nulty; Eric McDonald; Jessica Mizzi; Kevin D. Murray; Joshua R. Nahum; Kaben Nanlohy; Alexander Johan Nederbragt; Humberto Ortiz-Zuazaga; Jeramia Ory; Jason Pell; Charles Pepe-Ranney; Zachary N. Russ; Erich Schwarz; Camille Scott; Josiah Seaman; Scott Sievert; Jared Simpson; Connor T. Skennerton; James Spencer; Ramakrishnan Srinivasan; Daniel Standage; James A. Stapleton; Susan R. Steinman; Joe Stein; Benjamin Taylor; Will Trimble; Heather L. Wiencko; Michael Wright; Brian Wyss; Qingpeng Zhang; en zyme; C. Titus Brown

doi:10.12688/f1000research.6924.1

F1000Research (Sep 2015)

Affiliations

Michael R. Crusoe: Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA
Hussien F. Alameldin: Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI, USA
Sherine Awad: Population Health and Reproduction, University of California, Davis, Davis, CA, USA
Elmar Boucher: Department of Biomedical Engineering, Oregon Health and Science University, Portland, OR, USA
Adam Caldwell: Biology Department, San Jose State University, San Jose, CA, USA
Reed Cartwright: School of Life Sciences and The Biodesign Institute, Arizona State University, Tempe, AZ, USA
Amanda Charbonneau: Genetics, Michigan State University, East Lansing, MI, USA
Bede Constantinides: Computational and Evolutionary Biology, Faculty of Life Sciences, University of Manchester, Manchester, UK
Greg Edvenson: Micron Technology, Seattle, WA, USA
Scott Fay: Invitae, San Francisco, CA, USA
Jacob Fenton: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Thomas Fenzl: Independent Researcher, Munich, Germany
Jordan Fish: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Leonor Garcia-Gutierrez: Mathematics Institute, University of Warwick, Warwick, UK
Phillip Garland: Eastlake Data, Seattle, WA, USA
Jonathan Gluck: Graduate Program, University of Maryland, College Park, MD, USA
Iván González: Athinoula A. Martinos Center for Biomedical Imaging, Department of Radiology, Massachusetts General Hospital, Charlestown, MA, USA
Sarah Guermond: Independent Researcher, Seattle, WA, USA
Jiarong Guo: Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA
Aditi Gupta: Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA
Joshua R. Herr: Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA
Adina Howe: Department of Agricultural and Biosystems Engineering, Iowa State University, Ames, IA, USA
Alex Hyer: Department of Biology, University of Utah, Salt Lake City, UT, USA
Andreas Härpfer: ConSol* Software GmbH, Munich, Germany
Luiz Irber: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Rhys Kidd: Independent Researcher, Sydney, Australia
David Lin: Verdematics, Fremont, CA, USA
Justin Lippi: Independent Researcher, San Francisco, CA, USA
Tamer Mansour: Population Health and Reproduction, University of California, Davis, Davis, CA, USA
Pamela McA'Nulty: Addgene, Cambridge, MA, USA
Eric McDonald: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Jessica Mizzi: Biochemistry and Molecular Biology, Michigan State University, East Lansing, MI, USA
Kevin D. Murray: ARC Centre of Excellence in Plant Energy Biology, The Australian National University, Canberra, ACT, Australia
Joshua R. Nahum: BEACON Center, Michigan State University, East Lansing, MI, USA
Kaben Nanlohy: Independent Researcher, New Orleans, LA, USA
Alexander Johan Nederbragt: Centre for Ecological and Evolutionary Synthesis, Dept. of Biosciences, University of Oslo, Oslo, Norway
Humberto Ortiz-Zuazaga: Department of Computer Science, Rio Piedras Campus, University of Puerto Rico, San Juan, Puerto Rico
Jeramia Ory: Biochemistry, St. Louis College of Pharmacy, St. Louis, MO, USA
Jason Pell: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Charles Pepe-Ranney: Crop and Soil Sciences, Cornell University, Ithaca, NY, USA
Zachary N. Russ: Department of Bioengineering, UC Berkeley, Berkeley, CA, USA
Erich Schwarz: Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
Camille Scott: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Josiah Seaman: Data Visualization, Newline Technical Innovations, Windsor, CO, USA
Scott Sievert: Electrical and Computer Engineering, University of Minnesota, Minneapolis, MN, USA
Jared Simpson: Ontario Institute for Cancer Research, Toronto, ON, Canada
Connor T. Skennerton: Division of Geological and Planetary Sciences, California Institute of Technology, Pasadena, CA, USA
James Spencer: Dept of Physics and Dept of Materials, Imperial College London, London, UK
Ramakrishnan Srinivasan: Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA
Daniel Standage: Department of Biology, Indiana University, Bloomington, IN, USA
James A. Stapleton: Chemical Engineering & Materials Science, Michigan State University, East Lansing, MI, USA
Susan R. Steinman: The New York Eye and Ear Infirmary of Mount Sinai, New York, NY, USA
Joe Stein: Independent Researcher, Providence, RI, USA
Benjamin Taylor: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Will Trimble: Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, IL, USA
Heather L. Wiencko: Department of Genetics, Smurfit Institute, Trinity College Dublin, Dublin, Ireland
Michael Wright: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Brian Wyss: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
Qingpeng Zhang: Computer Science and Engineering, Michigan State University, East Lansing, MI, USA
en zyme: Independent Researcher, Boston, MA, USA
C. Titus Brown: Microbiology and Molecular Genetics, Michigan State University, East Lansing, MI, USA

Journal volume & issue: Vol. 4

Abstract

The khmer package is a freely available software library for working efficiently with fixed-length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, De Bruijn graph partitioning, and digital normalization. khmer is implemented in C++ and Python, and is distributed under the BSD license at https://github.com/dib-lab/khmer/.
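
To make "probabilistic k-mer counting" concrete, the following is a minimal sketch in plain Python. It is not khmer's API or implementation: the `kmers` helper, the `CountMinSketch` class, and the `width` and `depth` parameters are all hypothetical names chosen for illustration. It uses a Count-Min Sketch, one well-known structure of this kind, which stores counts in a few fixed-size tables and so may overestimate a k-mer's abundance but never underestimates it.

```python
import hashlib


def kmers(seq, k):
    """Yield every overlapping k-mer (fixed-length word) in seq."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


class CountMinSketch:
    """Illustrative approximate counter (hypothetical, not khmer's class)."""

    def __init__(self, width=1000, depth=4):
        self.width = width
        self.depth = depth
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, item):
        # Derive one hash per table by salting the item with the table index.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item):
        for row, col in self._slots(item):
            self.tables[row][col] += 1

    def count(self, item):
        # Collisions can only inflate a counter, so take the minimum.
        return min(self.tables[row][col] for row, col in self._slots(item))


sketch = CountMinSketch()
for kmer in kmers("ACGTACGTAC", 4):
    sketch.add(kmer)
```

Memory use here is fixed by `width * depth` regardless of how many distinct k-mers are seen, and the price is that `sketch.count("ACGT")` is an upper bound on the true count rather than an exact value.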

Bioinformatics

Published in F1000Research

ISSN: 2046-1402 (Online)
Publisher: F1000 Research Ltd
Country of publisher: United Kingdom
LCC subjects: Medicine; Science
Website: https://f1000research.com
