Gene prediction in prokaryotes using EasyGene
Gene prediction in prokaryote genomes might not be the most taxing question in current bioinformatics research. However, after dealing with genome comparisons recently, I realized that many differences between genomes stem solely from the annotation procedure and not from the sequence itself. This seems to be most pronounced for genomes that were sequenced several years ago, when gene prediction was more difficult due to the lack of reference strains.
Nielsen and Krogh developed EasyGene 1.2 (described in Bioinformatics a month ago), which homogenizes gene predictions for prokaryotic genomes using a fully automated procedure. They find discrepancies between their predictions and the deposited annotations that convincingly hint at errors in the original annotation of many genomes. The typical result is that many genomes are overannotated: too many small ORFs are considered real genes.
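To make the overannotation point concrete, here is a minimal Python sketch of how one might flag suspicious short ORFs when comparing a deposited annotation against a second set of predictions. This is not how EasyGene itself works; it is only a crude comparison under my own assumptions: genes are given as 1-based inclusive coordinates, two genes count as "the same" if they share a stop position and strand, and 300 nt is an arbitrary cutoff for "small". The names `Gene` and `unsupported_short_genes` are made up for illustration.

```python
from typing import List, NamedTuple


class Gene(NamedTuple):
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def stop_key(gene: Gene):
    # The stop codon lies at the right end on "+" and at the left end on "-".
    position = gene.end if gene.strand == "+" else gene.start
    return (position, gene.strand)


def unsupported_short_genes(annotated: List[Gene],
                            predicted: List[Gene],
                            max_len: int = 300) -> List[Gene]:
    """Annotated genes shorter than max_len (nt) whose stop position
    and strand match no predicted gene."""
    predicted_stops = {stop_key(g) for g in predicted}
    return [
        g for g in annotated
        if (g.end - g.start + 1) < max_len and stop_key(g) not in predicted_stops
    ]


# Toy data, invented for this example only.
annotated = [
    Gene(100, 1000, "+"),
    Gene(1200, 1350, "+"),
    Gene(2000, 2200, "-"),
    Gene(3000, 3150, "-"),
]
predicted = [
    Gene(130, 1000, "+"),   # same stop as the first gene, different start
    Gene(2000, 2180, "-"),  # same stop as the third gene
]

suspects = unsupported_short_genes(annotated, predicted)
```

Matching on the stop position rather than on both ends means that a disagreement about the start codon alone does not count as a missing gene, which seems the fairer comparison for spotting overannotation.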
A web server with their results is provided, but as new genomes appear by the dozens each month, I wish they would provide the code for a stand-alone installation too. Anyway, it is good to know that they take care of such seemingly dull but important work.
spitshine - 2005-12-23 10:21