The Explicit Decomposition with Neighborhoods (EDeN) is a decompositional kernel based on the Neighborhood Subgraph Pairwise Distance Kernel (NSPDK) that can be used to induce an explicit feature representation for graphs. This in turn allows machine learning algorithms to perform supervised and unsupervised learning tasks in a scalable way (e.g. using fast stochastic gradient descent methods for classification and approximate neighborhood queries for clustering).
Among the novelties introduced in EDeN is the ability to accept real-valued vector labels as input and to process weighted and nested graphs.
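To make the idea of an explicit feature representation more concrete, here is a minimal, self-contained sketch in plain Python. It is not EDeN's implementation and does not use EDeN's API. It only illustrates the general NSPDK-style recipe under simplifying assumptions: for every pair of nodes within a maximum distance, summarize the neighborhood around each node up to a given radius, hash the pair of summaries to an integer, and count occurrences. All names (`explicit_features`, `neighborhood_code`, `stable_hash`, `dot`) and the parameter defaults are hypothetical, and the neighborhood summary used here (a sorted list of distance and label pairs) is a lossy stand-in for a proper canonical subgraph encoding.

```python
import hashlib
from collections import Counter, deque


def bfs_distances(adj, source, max_dist):
    """Return {node: hop distance} for all nodes within max_dist of source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_dist:
            continue  # do not expand beyond the requested distance
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def stable_hash(text, n_bits=20):
    """Map a string to an integer in [0, 2**n_bits), stable across runs."""
    digest = hashlib.md5(text.encode("utf-8")).hexdigest()
    return int(digest, 16) % (1 << n_bits)


def neighborhood_code(adj, labels, root, radius):
    """Lossy summary of the neighborhood of `root`: sorted (distance, label) pairs.

    Assumption for illustration only: a real implementation would encode the
    subgraph structure (edges included), not just labels and distances.
    """
    dist = bfs_distances(adj, root, radius)
    return repr(sorted((d, labels[v]) for v, d in dist.items()))


def explicit_features(adj, labels, max_radius=1, max_distance=2, n_bits=20):
    """Return a sparse feature vector {feature index: count} for one graph."""
    features = Counter()
    for u in sorted(adj):
        for v, d in bfs_distances(adj, u, max_distance).items():
            if v < u:
                continue  # visit each unordered pair once (u == v is kept)
            for r in range(max_radius + 1):
                codes = sorted((neighborhood_code(adj, labels, u, r),
                                neighborhood_code(adj, labels, v, r)))
                key = repr((r, d, codes))
                features[stable_hash(key, n_bits)] += 1
    return dict(features)


def dot(x, y):
    """Dot product of two sparse vectors stored as dicts."""
    if len(x) > len(y):
        x, y = y, x
    return sum(value * y.get(index, 0) for index, value in x.items())


# Two numberings of the same labeled path C-O-C, and a path C-N-C.
g1 = ({0: [1], 1: [0, 2], 2: [1]}, {0: "C", 1: "O", 2: "C"})
g2 = ({0: [2], 1: [2], 2: [0, 1]}, {0: "C", 1: "C", 2: "O"})
g3 = ({0: [1], 1: [0, 2], 2: [1]}, {0: "C", 1: "N", 2: "C"})

f1, f2, f3 = (explicit_features(adj, labels) for adj, labels in (g1, g2, g3))
print(f1 == f2)                  # same graph, different node numbering
print(dot(f1, f1), dot(f1, f3))  # self-similarity vs. similarity to g3
```

Because each graph becomes a sparse vector, the kernel value between two graphs reduces to a dot product, and the vectors can be passed to any linear learner or nearest-neighbor index. That is the property the scalable classification and clustering use cases rely on.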
A few examples are available as IPython notebooks in the following GitHub repository: EDeN_examples.
You can install EDeN with pip directly from GitHub:
pip install git+https://github.com/fabriziocosta/EDeN.git --user
Costa, Fabrizio, and Kurt De Grave. "Fast neighborhood subgraph pairwise distance kernel." Proceedings of the 27th International Conference on Machine Learning, 2010. (ref)
K. De Grave, F. Costa, "Molecular Graph Augmentation with Rings and Functional Groups", Journal of Chemical Information and Modeling, 50 (9), pp 1660–1668, 2010. (ref)
Steffen Heyne, Fabrizio Costa, Dominic Rose, and Rolf Backofen, "GraphClust: alignment-free structural clustering of local RNA secondary structures", Bioinformatics, 28 no. 12 pp. i224-i232, 2012. (ref)
Kousik Kundu, Fabrizio Costa, and Rolf Backofen, "A graph kernel approach for alignment-free domain-peptide interaction prediction with an application to human SH3 domains", Bioinformatics, 29 no. 13 pp. i335-i343, 2013. (ref)
P. Frasconi, F. Costa, K. De Grave, L. De Raedt, "kLog: A Language for Logical and Relational Learning with Kernels", Artificial Intelligence, 2014. (ref)
Omer S. Alkhnbashi, Fabrizio Costa, Shiraz A. Shah, Roger A. Garrett, Sita J. Saunders and Rolf Backofen, "CRISPRstrand: Predicting repeat orientations to determine the crRNA-encoding strand at CRISPR loci", ECCB, 13th European Conference on Computational Biology, 2014. (ref)
Videm P., Rose D., Costa F., Backofen R., "BlockClust: efficient clustering and classification of non-coding RNAs from short read RNA-seq profiles", Bioinformatics, 2014 Jun 15;30(12):i274-82. (ref)
Daniel Maticzka, Sita J Lange, Fabrizio Costa, Rolf Backofen, "GraphProt: modeling binding preferences of RNA-binding proteins", Genome Biology 2014, 15:R17 (22 January 2014). (ref)
Gianluca Corrado, Toma Tebaldi, Giulio Bertamini, Fabrizio Costa, Alessandro Quattrone, Gabriella Viero and Andrea Passerini, "PTRcombiner: mining combinatorial regulation of gene expression from post-transcriptional interaction maps", BMC Genomics, 2014; 15:304. (ref)
R. Ferrarese, G. R. 4th Harsh, A. K. Yadav, E. Bug, D. Maticzka, W. Reichardt, S. M. Dombrowski, T. E. Miller, A. P. Masilamani, F. Dai, H. Kim, M. Hadler, D. M. Scholtens, I. L. Y. Yu, J. Beck, V. Srinivasasainagendra, F. Costa, N. Baxan, D. Pfeifer, D. V. Elverfeldt, R. Backofen, A. Weyerbrock, C. W. Duarte, X. He, M. Prinz, J. P. Chandler, H. Vogel, A. Chakravarti, J. N. Rich, M. S. Carro, M. Bredel, "Lineage-specific splicing of a brain-enriched alternative exon promotes glioblastoma progression", J Clin Invest, 124 no. 7 pp. 2861-2876, 2014. (ref)