Some stage of review
Published, accepted, or in press (since ~2001)
- Topological Summaries
of Tumor Images Improve Prediction of Disease Free Survival in
and assessment of fully automated and globally transitive geometric
morphometric methods, with application to a biological comparative
dataset with high interspecific variation
Geometry of Synchronization Problems and Learning Group Actions
HOMINID: A framework for identifying associations between
host genetic variation and microbiome composition
Differential Expression Analysis for RNAseq using Poisson Mixed Models
A phylogenetic transform enhances analysis of compositional
Detecting Epistasis in Genome-wide Association Studies with the
Marginal EPIstasis Test.
Fast moment estimation for generalized latent Dirichlet models.
- Approximations of Markov
Chains and High-Dimensional Bayesian Inference.
- Bayesian Approximate
Kernel Regression with Variable Selection.
- Adaptive Randomized
Dimension Reduction on Massive Data.
Learning Subspaces of Different Dimension.
Sufficient statistics for shapes and surfaces.
Randomized Algorithms for Dimension Reduction on Massive Data.
Towards stratification learning through homology inference.
Multiscale factor models for molecular networks.
Efficient genome-wide sequencing and low coverage pedigree
analysis from non-invasively collected samples. (2016), Genetics.
- Fast principal components
analysis reveals independent evolution of ADH1B gene in Europe and
East Asia. (2016), American Journal of Human Genetics.
Random Walks on Simplicial Complexes and Harmonics. (2016),
Random Structures and Algorithms.
- Geometric representations of random hyper-graphs. (2016), JASA.
- Topological Consistency
via Kernel Estimation. (2016), Bernoulli.
- Bayesian group latent
factor analysis with structured sparse priors. (2016), JMLR.
Statistical inference for dynamical systems: a review. (2015),
- Contour Trees of Uncertain
Terrains. (2015), ACM SIGSPATIAL in GIS 2015.
- Citizen Science as a New Tool in Dog Cognition Research. (2015), PLoS One.
Probabilistic Frechet Means and Statistics on Vineyards. (2015),
Electronic Journal of Statistics.
The information geometry of mirror descent. (2015), IEEE Trans. on
Consistency of maximum likelihood estimation for some dynamical
systems. (2015), Annals of Statistics.
The Topology of Probability Distributions on Manifolds. (2015),
Probability Theory and Related Fields.
Cumulon: Cloud-Based Statistical Analysis from Users Perspective.
(2014), IEEE Data Eng. Bull.
- Core and region-enriched networks of behaviorally regulated genes
and the singing genome. (2014), Science.
- Persistent Homology Transform for Modeling Shapes and
Surfaces. (2014), Information and Inference: A Journal of the IMA.
- A new fully automated approach for aligning and comparing
shapes. (2014), Anatomical Record.
Novel Distal eQTL Analysis Demonstrates Effect of Population Genetic
Architecture on Detecting and Interpreting Associations. (2014),
GSAASeqSP: A Toolset for Gene Set Association Analysis of
RNA-Seq Data. (2014), Scientific Reports.
A Cheeger-Type Inequality on Simplicial Complexes. (2014),
Advances in Applied Mathematics.
Frechet Means for Distributions of Persistence Diagrams.
(2014), Discrete and Computational Geometry.
A Digital Network Approach to Infer Sex Behavior in Emerging HIV Epidemics.
(2014), PLoS One.
Statistical Analysis of Crystallization Database Links Protein
Physico-Chemical Features with Crystallization Mechanisms.
(2014), PLoS One.
Distinct and Overlapping Sarcoma Subtypes Initiated from Muscle Stem and Progenitor Cells.
(2013), Cell Reports.
Genome-wide identification and predictive modeling of tissue-specific
A comparative study of covariance selection models for the inference
of gene regulatory networks.
(2013), Journal of Medical Bioinformatics.
- DNase-seq predicts regions of rotational nucleosome stability across diverse human cell types
(2013), Genome Research.
- Sustained-input switches for transcription factors and microRNAs are central building blocks of eukaryotic gene circuits.
(2013), Genome Biology.
- Partial factor modeling: predictor-dependent shrinkage for linear
regression. (2013), Journal of the American Statistical Association.
Sliced Inverse Regression: Regularization and Consistency. (2013), Abstract and Applied Analysis.
- Assessing the
radiation response of lung cancer with different gene mutations
using genetically engineered mice. (2013), Frontiers in Oncology.
Dissecting High-Dimensional Phenotypes with Bayesian Sparse Factor Analysis of Genetic Covariance Matrices.
Genetics of gene expression responses to temperature stress in a
sea urchin gene network. (2012), Molecular Ecology.
Predictive Framework for Integrating Disparate Genomic Data Types
Using Sample-Specific Gene Set Enrichment Analysis and Multi-Task
Learning. (2012), PLoS One.
- Genetic effects on mating
success and partner choice in a social mammal. (2012), American Naturalist.
Cyclin-Dependent Kinases Are Regulators and Effectors of Oscillations
Driven by a Transcription Factor Network. (2012), Molecular Cell.
Homology Transfer and Stratification Learning. (2012),
ACM-SIAM Symposium on Discrete Algorithms.
Probability measures on the space of persistence diagrams. (2012),
Integrating genetic and gene expression evidence into genome-wide
association analysis of gene sets. (2012), Genome Research.
RS-SNP: a random-set method for genome-wide association studies.
(2011), BMC Genomics.
Discovering genetic variants in Crohn's disease by exploring genomic regions enriched of weak association signals.
(2011), Digestive and Liver Disease.
Cross Species Genomic Analysis Identifies a Mouse Model as Undifferentiated Pleomorphic Sarcoma/Malignant Fibrous Histiocytoma.
(2011), PLoS One.
Estimating variable structure and dependence in Multi-task learning
via gradients. (2011), Machine Learning.
- Multiscale factor models for molecular networks. (2011), Proc of JSM.
On the reproducibility of results of pathway analysis in genome-wide
expression studies of colorectal cancers. (2010), Journal of Biomedical Informatics.
Localized Sliced Inverse Regression. (2010), Journal of
Computational and Graphical Statistics.
Learning gradients: predictive models that infer geometry and
dependence. (2010), Journal of Machine Learning Research.
Bayesian mixture of inverse regressions. (2010), International
Conference on Artificial Intelligence and Statistics.
- Learning Gradients and
Feature Selection on Manifolds. (2010), Bernoulli.
motif identification. (2010), Genome Biology.
Comparative study of gene set enrichment methods. (2009), BMC Bioinformatics.
features that predict allelic imbalance in humans suggest patterns
of constraint on gene expression variation. (2009), Molecular
Biology and Evolution.
serum biomarkers really measure breast cancer? (2009), BMC Cancer.
the developmental pathways TTF-1, NKX2-8, and PAX9 in lung
cancer. (2009), Proc. Natl. Acad. Sci. USA.
sliced inverse regression. (2009), Proceedings of Advances in Neural
Information Processing Systems.
cancer progression via pathway dependencies. (2008), PLoS Comput Biol.
Expression Programs of Human Smooth Muscle Cells: Tissue-Specific
Differentiation and Prognostic Significance in Breast Cancers.
(2007), PLoS Genetics.
- Understanding the use of
unlabelled data in predictive modelling. (2007), Statistical Science.
the Function Space for Bayesian Kernel Models. (2007), J Mach Learn Res.
Genomic sweeping for hypermethylated genes (2007), Bioinformatics.
of influence of genomic DNA sequence on human X chromosome
inactivation. (2006), PLoS Comput Biol.
- Analysis of Sample Set Enrichment
Scores: assaying the enrichment of sets of genes for individual
samples in genome-wide expression profiles. (2006), Bioinformatics.
expression changes and molecular pathways mediating
activity-dependent plasticity in visual cortex. (2006), Nat Neurosci.
Estimation of Gradients and Coordinate Covariation in
Classification. (2006), J Mach Learn Res.
Learning Coordinate Covariances via Gradients. (2006), J Mach Learn Res.
- Statistical Learning: Stability
is Sufficient for Generalization and Necessary and Sufficient for
Consistency of Empirical Risk Minimization. (2006), Adv Comput Math.
- Gene Set
Enrichment Analysis: A Knowledge-Based Approach for Interpreting
Genome-wide Expression Profiles (2005), Proc Natl Acad Sci USA.
An oncogenic KRAS2 expression signature identified by cross-species
gene-expression analysis (2005), Nat Genet.
- Stability Results in Learning Theory (2005), Anal App.
- Permutation Tests for
Classification (2005), Proceedings of the Conference on Learning Theory.
Risk Bounds for Mixture Density Estimation (2005),
ESAIM: Probability and Statistics.
- Gene Selection via a Spectral
Approach (2005), IEEE Workshop on Computer Vision Methods for Bioinformatics.
- Androgen-Induced Differentiation and Tumorigenicity of Human Prostate
Epithelial Cells. (2004), Cancer Research.
Theory: general conditions for predictivity. (2004), Nature.
Estimating Dataset Size Requirements for Classifying DNA
Microarray Data. (2003), J Comput Biol.
Analytical Method for Multi-class Molecular Cancer Classification. (2003), SIAM Review.
Optimal gene expression analysis by microarrays. (2002), Cancer Cell.
Gene Expression-Based Classification and Outcome Prediction of
Central Nervous System Embryonal Tumors. (2002), Nature.
- Choosing Multiple Parameters for
Support Vector Machines. (2002), Machine Learning.
A Uniform Approach to Molecular Cancer Diagnosis Using Tumor
Gene Expression Signatures. (2001), Proc Natl Acad Sci U S A.
Molecular classification of multiple tumor types. (2001), Bioinformatics.
Bounds on sample size for policy evaluation in Markov
environments. (2001), Proceedings of the Conference on Learning Theory.
- Feature Selection for SVMs. J Weston,
S Mukherjee, O Chapelle, M Pontil, T Poggio, V Vapnik. Proc Neural Information Processing Systems.
- Classifying Microarray Data Using
Support Vector Machines. Understanding and Using Microarray Analysis Techniques: A Practical Guide.
- Regression and Classification with
Regularization. Nonlinear Estimation and Classification.
- Uncertainty in Geometric Computations.
Statistical learning theory lecture notes, random notes.
Non-parametric Bayesian kernel models, Working Paper.
- Support Vector Method for Multivariate Density
Estimation, CBCL/AI Memo.
- Support Vector Machine Classification of Microarray Data, CBCL/AI Memo.