Resource Description
CD-HIT is now available from GitHub at https://github.com/weizhongli/cdhit.
CD-HIT is a program for clustering DNA/protein sequence databases at high identity with tolerance.
References:
CD-HIT: accelerated for clustering the next generation sequencing data, Limin Fu, Beifang Niu, Zhengwei Zhu, Sitao Wu, Weizhong Li, Bioinformatics, (2012) 28(23):3150-2.
Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences, Weizhong Li & Adam Godzik, Bioinformatics, (2006) 22:1658-9.
Tolerating some redundancy significantly speeds up clustering of large protein data
File List
cd-hit-v4.6.1-2012-08-27
cdhit-common.h
clstr_reduce.pl
README
Makefile
clstr_cut.pl
clstr_rep.pl
clstr_quality_eval_by_link.pl
clstr2tree.pl
clstr_merge_noorder.pl
cd-hit-para.pl
cdhit-utility.c++
clstr_size_histogram.pl
license.txt
ChangeLog
cdhit-div.c++
cd-hit-div.pl
cdhit-2d.c++
cdhit-est-2d.c++
psi-cd-hit-2d-g1.pl
psi-cd-hit-local.pl
clstr_quality_eval.pl
cdhit-common.c++
clstr2txt.pl
clstr_sql_tbl_sort.pl
make_multi_seq.pl
cd-hit-2d-para.pl
clstr_size_stat.pl
plot_2d.pl
plot_len1.pl
clstr_sort_by.pl
cdhit-utility.h
doc
clstr2xml.pl
cdhit.c++
psi-cd-hit-2d.pl
psi-cd-hit.pl
clstr_reps_faa_rev.pl
clstr_merge.pl
clstr_sql_tbl.pl
clstr_select.pl
clstr_select_rep.pl
clstr_rev.pl
cdhit-est.c++
cdhit-454.c++
clstr_renumber.pl
clstr_sort_prot_by.pl