Epitopia Overview


Epitopia is a server for studying the immunogenic nature of a protein. It implements a machine learning scheme that ranks the individual amino acids of the protein by their potential to elicit a humoral immune response.


The interaction between an antibody and its antigen is at the heart of the humoral immune response. Specific regions of an antigen, termed epitopes, elicit a humoral immune response and are thus recognized by antibodies. Based on large datasets of antigen 3D structures and sequences for which a validated epitope is known, several physico-chemical and structural-geometrical properties that significantly distinguish epitopes from the remaining antigen surface have been derived (Rubinstein et al., 2008). A machine learning scheme was then implemented to train a Naive Bayes classifier on these data for the purpose of detecting protein regions that manifest epitope-like characteristics.
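As a rough illustration of what training a Naive Bayes classifier involves, the sketch below estimates class priors and smoothed per-property likelihoods from labeled examples. It is not Epitopia's code: the property names, the discretized values, the Laplace smoothing parameter `alpha`, and the function names are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples, alpha=1.0):
    """Estimate log priors and Laplace-smoothed log likelihoods.

    examples: list of (features, label) pairs, where features maps a
    property name to a discrete value and label is a class name.
    All names here are hypothetical, chosen for illustration only.
    """
    label_counts = Counter(label for _, label in examples)
    value_counts = defaultdict(Counter)  # (label, property) -> value counts
    values_seen = defaultdict(set)       # property -> all observed values
    for features, label in examples:
        for prop, value in features.items():
            value_counts[(label, prop)][value] += 1
            values_seen[prop].add(value)

    total = sum(label_counts.values())
    log_prior = {lab: math.log(n / total) for lab, n in label_counts.items()}

    log_likelihood = {}
    for lab in label_counts:
        for prop, values in values_seen.items():
            counts = value_counts[(lab, prop)]
            denom = sum(counts.values()) + alpha * len(values)
            for v in values:
                log_likelihood[(lab, prop, v)] = math.log((counts[v] + alpha) / denom)
    return log_prior, log_likelihood

def log_posterior(features, label, log_prior, log_likelihood):
    """Unnormalized log posterior: log prior plus the sum of log likelihoods."""
    return log_prior[label] + sum(
        log_likelihood[(label, prop, value)] for prop, value in features.items()
    )

# Toy training data (made up for this sketch, not real epitope data).
examples = [
    ({"hydrophilicity": "high", "exposure": "exposed"}, "epitope"),
    ({"hydrophilicity": "high", "exposure": "exposed"}, "epitope"),
    ({"hydrophilicity": "low", "exposure": "buried"}, "non-epitope"),
    ({"hydrophilicity": "low", "exposure": "exposed"}, "non-epitope"),
]
log_prior, log_likelihood = train_naive_bayes(examples)
query = {"hydrophilicity": "high", "exposure": "exposed"}
epitope_lp = log_posterior(query, "epitope", log_prior, log_likelihood)
non_epitope_lp = log_posterior(query, "non-epitope", log_prior, log_likelihood)
```

The "naive" part is the assumption that properties are independent given the class, which is what allows the per-property log likelihoods to be summed.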


Epitopia may be used to detect immunogenic regions in either a given protein structure or a given protein sequence. The input is analyzed with regard to its physico-chemical and structural-geometrical properties. The Naive Bayes classifier, which was trained on a dataset of epitope and non-epitope examples, then computes, for each property of each residue of the antigen, the probability of its being an epitope based on the region it is embedded in. In other words, each epitope-sized region of the protein is given a score that reflects the joint probability of its physico-chemical and structural-geometrical properties belonging to an epitope, based on validated epitope examples. The joint probability is expressed as a sum of log probabilities and is assigned to the amino acid at the center of that epitope-sized region, enabling inference of the immunogenic potential at single amino-acid resolution. Given the immunogenicity score of a residue, the probability that it was drawn from a population of epitope residues is then computed as the fraction of validated epitope residues among all residues of the training data with an immunogenicity score in the same range.
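The two steps described above, a sum-of-logs score per region and a score-to-probability lookup, can be sketched as follows. This is a simplified illustration under stated assumptions, not Epitopia's implementation: the per-property probabilities, the bin edges, the toy training scores, and all function names are hypothetical.

```python
import math
from bisect import bisect_right

def immunogenicity_scores(region_property_probs):
    """Sum of log probabilities for each region.

    region_property_probs[i] is assumed to hold the per-property
    probabilities for the epitope-sized region centered on residue i;
    the resulting score is assigned to that central residue.
    """
    return [sum(math.log(p) for p in probs) for probs in region_property_probs]

def calibrate(train_scores, train_is_epitope, bin_edges):
    """Fraction of validated epitope residues in each score range.

    bin_edges must be sorted; they split the score axis into
    len(bin_edges) + 1 ranges. Empty ranges yield None.
    """
    n_bins = len(bin_edges) + 1
    totals = [0] * n_bins
    positives = [0] * n_bins
    for score, is_epitope in zip(train_scores, train_is_epitope):
        b = bisect_right(bin_edges, score)
        totals[b] += 1
        positives[b] += int(is_epitope)
    return [pos / tot if tot else None for pos, tot in zip(positives, totals)]

def epitope_probability(score, bin_edges, bin_fractions):
    """Look up the training-set epitope fraction for the range containing score."""
    return bin_fractions[bisect_right(bin_edges, score)]

# Toy inputs (made up for this sketch).
regions = [[0.5, 0.5], [0.9, 0.8]]  # two residues, two properties each
scores = immunogenicity_scores(regions)

train_scores = [-3.0, -2.5, -1.0, -0.9, -0.5, -0.2]
train_is_epitope = [False, False, False, True, True, True]
bin_edges = [-2.0, -0.75]
bin_fractions = calibrate(train_scores, train_is_epitope, bin_edges)
probabilities = [epitope_probability(s, bin_edges, bin_fractions) for s in scores]
```

Working in log space turns the product of many small probabilities into a sum, which avoids numerical underflow. The binning step maps that raw score back to an interpretable fraction of validated epitope residues.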



After the "Submit" button has been pressed, a preprocessing stage prepares the input and runs several stand-alone executables that extract some of the physico-chemical and structural-geometrical properties required for the Epitopia prediction.


For each Epitopia run, a "job status" page is created and updated every 30 seconds. When the computation is complete, links to the different results appear, and an email is sent to the user if an email address was provided.


    Atchley WR, Zhao J, Fernandes AD, Druke T. 2005. Solving the protein sequence metric problem. Proc. Natl. Acad. Sci. U.S.A. 102: 6395-6400.

    Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. 2000. The Protein Data Bank. Nucleic Acids Res. 28: 235-242.

    Emini EA, Hughes JV, Perlow DS, Boger J. 1985. Induction of hepatitis A virus-neutralizing antibody by a virus-specific synthetic peptide. J. Virol. 55: 836-839.

    Grantham R. 1974. Amino acid difference formula to help explain protein evolution. Science 185: 862-864.

    Hopp TP, Woods KR. 1981. Prediction of protein antigenic determinants from amino acid sequences. Proc. Natl. Acad. Sci. U.S.A. 78: 3824-3828.

    Janin J, Wodak S. 1978. Conformation of amino acid side-chains in proteins. J. Mol. Biol. 125: 357-386.

    Kabsch W, Sander C. 1983. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers 22: 2577-2637.

    Karplus PA, Schulz GE. 1985. Prediction of chain flexibility in proteins: a tool for the selection of peptide antigen. Naturwissenschaften 72: 212-213.

    Kolaskar AS, Tongaonkar PC. 1990. A semi-empirical method for prediction of antigenic determinants on protein antigens. FEBS Lett. 276: 172-174.

    Parker JM, Guo D, Hodges RS. 1986. New hydrophilicity scale derived from high-performance liquid chromatography peptide retention data: correlation of predicted surface residues with antigenicity and X-ray-derived accessible sites. Biochemistry 25: 5425-5432.

    Ponnuswamy PK, Prabhakaran M, Manavalan P. 1980. Hydrophobic packing and spatial arrangement of amino acid residues in globular proteins. Biochim. Biophys. Acta 623: 301-326.

    Rost B, Yachdav G, Liu J. 2004. The PredictProtein server. Nucleic Acids Res. 32: W321-326.

    Rubinstein ND, Mayrose I, Halperin D, Yekutieli D, Gershoni JM, Pupko T. 2008. Computational characterization of B-cell epitopes. Mol. Immunol. 45: 3477-3489.

    Rubinstein ND, Mayrose I, Pupko T. 2009. A machine-learning approach for predicting B-cell epitopes. Mol. Immunol. 46: 840-847.

    Sayle RA, Milner-White EJ. 1995. RASMOL: biomolecular graphics for all. Trends Biochem. Sci. 20: 374.

    Tsodikov OV, Record MT, Sergeev YV. 2002. Novel computer program for fast exact calculation of accessible and molecular surface areas and average surface curvature. J. Comput. Chem. 23: 600-609.
