EC Number Prediction
  • A number of genome sequencing projects on various organisms have generated a large amount of gene and protein sequence information. The focus is now on identifying and functionally characterizing the proteins encoded by these genomes.
  • The Enzyme Commission number (EC number) is a numerical classification scheme for enzymes, based on the chemical reactions they catalyze.
  • EC numbers represent enzymes and enzyme genes (genomic information), but they are also utilized as identifiers of enzymatic reactions (chemical information).
  • The scheme organizes enzyme reactions hierarchically into six main classes (oxidoreductases, transferases, hydrolases, lyases, isomerases and ligases), each of which is further subdivided at three more hierarchical levels, giving a four-part number.
  • Owing to recent structural genomics initiatives, a large and growing number of enzymes have no functional annotation, while experimental functional characterization is time-consuming and expensive.
  • High-precision EC number assignment is essential for studies such as metabolic pathway reconstruction, understanding evolutionary relationships in pathways, and metabolite prediction.
  • Hence there is a pressing need for improved computational techniques for precise prediction and assignment of EC numbers.
  • Here, we present ECpred, a predictive model that assigns an EC number to an enzyme of unknown function using two supervised machine learning approaches: k-Nearest Neighbor (k-NN) and Probabilistic Neural Network (PNN). The final prediction is a consensus of the predictions made by the selected algorithms, and a probability is assigned to it.
  • ECpred classifies an enzyme into one of 3349 EC numbers.
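
The consensus idea described above can be illustrated with a minimal sketch. This is not the ECpred implementation: the text does not say which features are used or how the two predictions are combined, so the two-dimensional feature vectors, the toy training set, the function names, the parameters `k` and `sigma`, and the choice to average the two probability estimates are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy training set: (feature vector, EC number). The features are invented.
TRAIN = [
    ((0.10, 0.20), "1.1.1.1"),
    ((0.20, 0.10), "1.1.1.1"),
    ((0.15, 0.25), "1.1.1.1"),
    ((0.90, 0.80), "3.4.21.4"),
    ((0.80, 0.90), "3.4.21.4"),
    ((0.85, 0.75), "3.4.21.4"),
]

def knn_probs(train, query, k=3):
    """k-NN: fraction of the k nearest training points carrying each label."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return {label: n / k for label, n in votes.items()}

def pnn_probs(train, query, sigma=0.5):
    """PNN: mean Gaussian kernel response per class, normalised over classes."""
    sums, counts = defaultdict(float), Counter()
    for features, label in train:
        d2 = math.dist(features, query) ** 2
        sums[label] += math.exp(-d2 / (2 * sigma ** 2))
        counts[label] += 1
    scores = {label: sums[label] / counts[label] for label in sums}
    total = sum(scores.values())
    if total == 0.0:  # every kernel underflowed: fall back to a uniform guess
        return {label: 1.0 / len(scores) for label in scores}
    return {label: s / total for label, s in scores.items()}

def consensus(train, query, k=3, sigma=0.5):
    """Assumed consensus rule: average both estimates and keep the top label."""
    p_knn = knn_probs(train, query, k)
    p_pnn = pnn_probs(train, query, sigma)
    labels = set(p_knn) | set(p_pnn)
    combined = {
        label: (p_knn.get(label, 0.0) + p_pnn.get(label, 0.0)) / 2
        for label in labels
    }
    best = max(combined, key=combined.get)
    return best, combined[best]

label, probability = consensus(TRAIN, (0.12, 0.18))
print(label, round(probability, 3))
```

The query point lies near the first cluster, so both classifiers favour the same label and the averaged probability is high. When the two disagree, the averaged score drops, which is one simple way a probability could reflect how strongly the methods agree.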