Notice: Due to my upcoming relocation, all programs and source code are now
maintained on Google Code. This page will be left as it is and will not be
updated.
GEsnpx: A genetic ensemble approach for gene-gene interaction identification
GEsnpx is an implementation of a
hybrid algorithm developed for identifying gene-gene interactions in complex
diseases. The system uses a multi-objective genetic
algorithm with an ensemble of five nonlinear classifiers to capture gene-gene
interactions through SNP markers. SNP subsets are evaluated and selected in a
combinatorial manner, and potential interactions are identified by a
combinatorial ranking procedure.
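As a rough illustration of what a combinatorial ranking can look like, the sketch below counts how often each SNP combination of a given order appears across a collection of selected SNP subsets, then ranks the combinations by that count. This is an assumption made for illustration only and is not necessarily the procedure GEsnpx uses. The function name `rank_combinations` and the SNP identifiers are hypothetical.

```python
from collections import Counter
from itertools import combinations


def rank_combinations(selected_subsets, order=2):
    """Rank SNP combinations by how many selected subsets contain them."""
    counts = Counter()
    for subset in selected_subsets:
        # Sort so that each combination has one canonical tuple form.
        for combo in combinations(sorted(set(subset)), order):
            counts[combo] += 1
    # Most frequent first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical SNP subsets, e.g. kept by several genetic algorithm runs.
subsets = [["rs1", "rs2", "rs3"], ["rs1", "rs2"], ["rs2", "rs3"]]
for combo, count in rank_combinations(subsets, order=2):
    print(combo, count)
```

Setting `order=3` in the same function would rank SNP triplets instead of pairs, which is how a frequency-based ranking could extend to higher-order interactions.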
The current version supports case-control association
studies. Besides its detection power for SNP pairs (two-SNP interactions), which is comparable
to that of many other state-of-the-art programs, the parallel support of GEsnpx for
higher-order gene-gene interaction identification sets it apart from single-SNP or pairwise SNP screening algorithms. Please refer to reference [1] for more
details on the implementation and evaluation of GEsnpx.
Note that in the current implementation, we have modified the classifier evaluation method to use the Area Under the ROC Curve (AUC)
in order to handle imbalanced case-control datasets. This may result in longer computation times, depending on the type of machine you are using.
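To see why AUC is a more informative score than plain accuracy on imbalanced data, consider the minimal sketch below. It computes AUC as the fraction of (case, control) pairs in which the case receives the higher score, counting ties as one half. The function name `auc` and the label convention (1 = case, 0 = control) are assumptions for this example and do not reflect the GEsnpx code.

```python
def auc(scores, labels):
    """AUC as the probability that a random case outscores a random control."""
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(controls))


# One case and nine controls: a classifier that gives every sample the same
# score can reach 90% accuracy by always predicting "control", yet its AUC
# is only 0.5, i.e. no better than chance.
labels = [1] + [0] * 9
constant_scores = [0.2] * 10
print(auc(constant_scores, labels))
```

Comparing every case against every control is more expensive than counting correct predictions, which is consistent with AUC-based evaluation generally taking longer.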
A random oversampling procedure has been added to address the same problem when
the case-control ratio is highly imbalanced (it must be specified explicitly to
be used). We expect these changes to increase the detection power when
the data is imbalanced.
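Random oversampling in its generic form can be sketched as follows: samples from the smaller class are redrawn with replacement until both classes have the same size. This is a minimal sketch under that assumption. The function name `random_oversample`, the `seed` parameter, and the sample identifiers are hypothetical, and the details of the GEsnpx procedure may differ.

```python
import random


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority samples until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for sample, label in zip(samples, labels):
        by_class.setdefault(label, []).append(sample)
    target = max(len(members) for members in by_class.values())
    out_samples, out_labels = [], []
    for label, members in by_class.items():
        # Keep every original sample, then draw extras with replacement.
        extras = [rng.choice(members) for _ in range(target - len(members))]
        for sample in members + extras:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Four controls (label 0) and one case (label 1), all hypothetical.
samples = ["ctrl1", "ctrl2", "ctrl3", "ctrl4", "case1"]
labels = [0, 0, 0, 0, 1]
balanced_samples, balanced_labels = random_oversample(samples, labels)
print(balanced_labels.count(0), balanced_labels.count(1))
```

Because oversampling only duplicates existing samples, it should be applied to training data only, so that duplicated samples do not leak into the data used for evaluation.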
A new diversity measure, "kappa diversity", is implemented and used by default. The original
"double fault diversity" can still be selected through the program options.
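For readers unfamiliar with these two measures, the sketch below computes their standard pairwise forms for two classifiers from per-sample correctness flags. Double fault is the fraction of samples both classifiers get wrong, and kappa is Cohen's agreement statistic on the same flags (lower kappa means more diverse classifiers). Whether GEsnpx computes them on correctness flags in exactly this way is an assumption, and the function name `pairwise_diversity` is hypothetical.

```python
def pairwise_diversity(correct_a, correct_b):
    """Return (kappa, double_fault) for two classifiers' correctness flags."""
    n = len(correct_a)
    if n == 0 or n != len(correct_b):
        raise ValueError("need two non-empty flag lists of equal length")
    pairs = list(zip(correct_a, correct_b))
    both_right = sum(1 for a, b in pairs if a and b) / n
    only_a = sum(1 for a, b in pairs if a and not b) / n
    only_b = sum(1 for a, b in pairs if b and not a) / n
    both_wrong = sum(1 for a, b in pairs if not a and not b) / n

    double_fault = both_wrong
    observed = both_right + both_wrong  # how often the two agree
    # Agreement expected by chance, from each classifier's own accuracy.
    expected = ((both_right + only_a) * (both_right + only_b)
                + (only_b + both_wrong) * (only_a + both_wrong))
    kappa = 1.0 if expected == 1.0 else (observed - expected) / (1.0 - expected)
    return kappa, double_fault


# Two classifiers that are wrong on different samples: never wrong together.
print(pairwise_diversity([1, 1, 0, 0], [0, 0, 1, 1]))
```

Classifiers that always fail on the same samples give a kappa of 1, while classifiers whose errors are unrelated give a kappa near 0.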
To test the program verbosely, use the verbose option "-v".
- GEsnpx 1.1
- test dataset1 [20 SNPs]
- test dataset2 [100 SNPs]
- as requested by many people, the source code is now available to
academic (non-commercial) users [download]
Test dataset1 and test dataset2 were obtained from study [2].
* Java 5.0 is required for executing the program.
To obtain general information about the program, run the following command on the command line (without parameters):
java -jar GEsnpx.jar
To test the program, run the program with the example dataset as follows:
java -jar GEsnpx.jar -f
java -jar GEsnpx.jar -f
We welcome any help in improving the quality of the software. To report bugs, please email the following address:
[1] Pengyi Yang, Joshua W.K. Ho,
Albert Y. Zomaya, Bing B. Zhou, "A genetic ensemble approach for gene-gene interaction
identification", BMC Bioinformatics, 2010, 11:524. [fulltext]
[2] Jason H. Moore et al., "Application of genetic algorithms to the
discovery of complex models for simulation studies in human genetics", In:
Proceedings of the Genetic and Evolutionary Computation Conference,