ClustScan and CompGen were used to build a proprietary database, r-CSDB (the recombinant-ClustScan DataBase), of predicted, entirely novel recombinant products that can be used for in silico screening with computer-aided drug design technologies. Like CSDB, r-CSDB holds the complete data for each recombinant gene cluster, starting with the recombinant DNA sequence of the cluster and including the DNA and protein sequences of its genes, modules, domains, and the corresponding linkers and dockers. It also contains all known polyketide and peptide building blocks as isomeric SMILES, along with the programmed logic that predicts linear and cyclic polyketide and peptide chains and aglycons in 2-D or 3-D forms suitable for further computer processing. The database is fully searchable by recombinant gene cluster annotations as well as by recombinant compound structures. As with CSDB, the r-CSDB data can also be manipulated with a number of conventional bioinformatic tools.
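
To illustrate how a linear chain could be derived from building blocks stored as SMILES, a minimal Python sketch follows. It is not the r-CSDB logic itself: the fragment tables, the names `STARTERS`, `EXTENDERS` and `linear_chain`, and the choice of plain string concatenation are all assumptions made for illustration. Stereochemistry, reductive processing of the keto groups, and cyclization are deliberately left out.

```python
# Hypothetical toy building blocks (not taken from r-CSDB). Real entries would
# be isomeric SMILES carrying stereochemistry, which is omitted here.
# Each fragment is written so that plain concatenation yields a valid SMILES.
STARTERS = {
    "acetyl": "CC(=O)",
    "propionyl": "CCC(=O)",
}
EXTENDERS = {
    "malonyl": "CC(=O)",           # adds -CH2-C(=O)-
    "methylmalonyl": "C(C)C(=O)",  # adds -CH(CH3)-C(=O)-
}


def linear_chain(starter, extenders):
    """Return the SMILES of an unreduced linear polyketide free acid.

    The starter fragment comes first, the extender fragments follow in
    module order, and a terminal "O" caps the last carbonyl as a carboxylic acid.
    """
    parts = [STARTERS[starter]] + [EXTENDERS[name] for name in extenders]
    return "".join(parts) + "O"


if __name__ == "__main__":
    # Acetyl starter followed by one malonyl and one methylmalonyl extender.
    print(linear_chain("acetyl", ["malonyl", "methylmalonyl"]))
```

Concatenation is used here only because it keeps the example short. A production pipeline would more plausibly build the molecule with a cheminformatics toolkit so that valence, stereochemistry, and ring closure are checked explicitly.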

