We have introduced iSOM-GSN, a systematic, generalized method for transforming high-dimensional “multi-omic” data onto a two-dimensional grid. Based on the idea of Kohonen’s self-organizing map, we generate for each sample a two-dimensional grid over a given set of genes, which represents a gene similarity network. We then apply a convolutional neural network to these grids to predict disease states of various types. The scheme not only achieves nearly perfect classification accuracy, but also provides an enhanced scheme for representation learning, visualization, dimensionality reduction, and interpretation of multi-omic data.
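The general idea of placing genes on a self-organizing map and then painting one grid per sample can be sketched as follows. This is a minimal illustration in plain Python, not the iSOM-GSN implementation: the function names (`train_som`, `best_unit`, `sample_grid`), the linear decay schedules, the grid size, and the choice to sum the values of genes that land on the same cell are all assumptions made for the example.

```python
import math
import random


def best_unit(weights, x):
    """Return the grid cell whose weight vector is closest to x."""
    return min(weights, key=lambda cell: math.dist(weights[cell], x))


def train_som(vectors, rows, cols, epochs=20, lr0=0.5, sigma0=1.5, seed=0):
    """Fit a rows x cols self-organizing map to one feature vector per gene."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # One randomly initialised weight vector per grid unit.
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    total = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = list(vectors)
        rng.shuffle(order)
        for x in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)               # decaying learning rate (assumed schedule)
            sigma = sigma0 * (1.0 - frac) + 1e-3  # shrinking neighbourhood radius
            br, bc = best_unit(weights, x)
            for (r, c), w in weights.items():
                d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))  # Gaussian neighbourhood
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
            step += 1
    return weights


def sample_grid(weights, gene_vectors, sample_values, rows, cols):
    """Paint one sample's per-gene values onto the grid cells of the genes."""
    grid = [[0.0] * cols for _ in range(rows)]
    for vec, value in zip(gene_vectors, sample_values):
        r, c = best_unit(weights, vec)
        grid[r][c] += value  # assumption: colliding genes are summed
    return grid
```

In this sketch, similar genes end up in nearby cells, so each sample becomes a small image in which neighbouring pixels correspond to related genes. Such images could then be passed to any image classifier, for example a convolutional neural network.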
Zseq is an approach for filtering the reads produced by high-throughput sequencing technologies. It is a linear method that identifies the most informative genomic sequences and reduces the number of biased sequences, low-complexity regions, and ambiguous nucleotides. Filtering reads is a routine task in bioinformatics, and it is necessary before secondary-level data processing takes place. There are many known sources of bias, which come from library preparation, the chosen sequencing technology, or possible contamination. Filtering improves the mapping rate of mappers/aligners by removing confusing low-information reads that can be mapped to multiple locations. It also improves the performance of the alignment process and the quality of the assembled transcripts.
Zseq takes reads in .fastq file format as input. It is freely available at the following link: http://sourceforge.net/p/zseq/wiki/Home/. When using these tools, please cite: A. Alkhateeb, S. Reddy, I. Rezaeian, L. Rueda, "Zseq: an approach for filtering low complex and biased sequences in next generation sequencing data", Advances in Bioinformatics and Artificial Intelligence: Bridging the Gap (IJCAI-BAI 2015), Buenos Aires, Argentina, 2015.
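To make the kind of filtering described above concrete, here is a minimal sketch of a .fastq read filter in Python. It is not Zseq's algorithm: it simply drops reads with too many ambiguous `N` bases or with low Shannon entropy of their nucleotide composition, which is one common proxy for low complexity. The function names and the thresholds (`max_n_fraction`, `min_entropy`) are assumptions chosen for illustration.

```python
import math
from collections import Counter


def parse_fastq(lines):
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    for i in range(0, len(rows) - 3, 4):
        yield rows[i], rows[i + 1], rows[i + 3]  # rows[i + 2] is the '+' line


def shannon_entropy(seq):
    """Entropy in bits of the base composition (0 for a homopolymer)."""
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in Counter(seq).values())


def keep_read(seq, max_n_fraction=0.1, min_entropy=1.5):
    """Illustrative rule: few ambiguous bases and reasonably mixed composition."""
    if not seq:
        return False
    seq = seq.upper()
    n_fraction = seq.count("N") / len(seq)
    return n_fraction <= max_n_fraction and shannon_entropy(seq) >= min_entropy


def filter_reads(records, **kwargs):
    """Yield only the records whose sequence passes keep_read."""
    for header, seq, qual in records:
        if keep_read(seq, **kwargs):
            yield header, seq, qual
```

A typical use would be `filter_reads(parse_fastq(open("reads.fastq")))`, where `reads.fastq` is a placeholder file name. The surviving records can then be written back out in the same four-line layout before alignment.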
Constrained Multi-level Thresholding (CMT) is another tool that we have developed for finding enriched regions and binding sites in ChIP-Seq data. The software tool can be downloaded from this link. When using these tools, please cite: "CMT: A Constrained Multi-level Thresholding Approach for ChIP-Seq Data Analysis", PLOS ONE, 2014. Accepted - In Press.
Optimal Multi-level Thresholding Gridding is a method for gridding cDNA microarray images based on the principles of segmentation techniques from image processing. It finds the sub-grids in a full cDNA microarray image and the location of the individual spots in each sub-grid. It works automatically and is free of user-defined parameters. It has been found to outperform state-of-the-art methods. The tools can be downloaded from this link. When using these tools, please cite: L. Rueda, I. Rezaeian, A Fully Automatic Gridding Method for cDNA Microarray Images, BMC Bioinformatics, 2011, 12:113.
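As a rough illustration of thresholding-based gridding, the sketch below sums pixel intensities along rows and columns and separates spot bands from background gaps with a single Otsu threshold. This is a deliberately simplified stand-in: it uses one threshold per profile rather than optimal multi-level thresholding, and it is not the published method. The function names and the toy image are assumptions made for the example.

```python
def otsu_threshold(values):
    """Return the cut that maximises between-class variance (Otsu's criterion)."""
    ordered = sorted(values)
    n = len(ordered)
    total = sum(ordered)
    best_cut, best_score = ordered[0], -1.0
    left_sum = 0.0
    for i in range(1, n):
        left_sum += ordered[i - 1]
        if ordered[i] == ordered[i - 1]:
            continue  # no cut can fall between equal values
        w0, w1 = i / n, (n - i) / n
        m0, m1 = left_sum / i, (total - left_sum) / (n - i)
        score = w0 * w1 * (m0 - m1) ** 2
        if score > best_score:
            best_score = score
            best_cut = (ordered[i - 1] + ordered[i]) / 2
    return best_cut


def spot_runs(profile, threshold):
    """Return inclusive (start, end) index ranges where profile > threshold."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v > threshold and start is None:
            start = i
        elif v <= threshold and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


def grid_bands(image):
    """Locate row and column bands of spots in a 2-D list of intensities."""
    row_profile = [sum(row) for row in image]
    col_profile = [sum(col) for col in zip(*image)]
    rows = spot_runs(row_profile, otsu_threshold(row_profile))
    cols = spot_runs(col_profile, otsu_threshold(col_profile))
    return rows, cols
```

Each pair of a row band and a column band delimits one candidate spot cell. A real microarray image would also need rotation correction and noise handling, which this sketch omits.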
Datasets of my published papers are available upon request. Some links are given below.