

S Zhao, J Liu, P Nanga, Y Liu, A E Cicek, N Knoblauch, C He, M Stephens and X He. Model-based analysis of positive selection significantly expands the list of cancer driver genes, including RNA methyltransferases. bioRxiv doi:10.1101/366823. accompanying code and data resources

D Gerard and M Stephens. Empirical Bayes shrinkage and false discovery rate estimation, allowing for unwanted variation. To appear in Biostatistics. R package | code used to produce results in paper

Y Kim, P Carbonetto, M Stephens and M Anitescu. A fast algorithm for maximum likelihood estimation of mixture proportions using sequential quadratic programming. arXiv:1806.01412. accompanying code resources | R package

L F V Ferrão, R G Ferrão, M A G Ferrão, A Fonseca, P Carbonetto, M Stephens and A A F Garcia. Accurate genomic prediction of Coffea canephora in multiple environments using whole-genome statistical models. Heredity to appear. data

M C Ward, S Zhao, K Luo, B J Pavlovic, M M Karimi, M Stephens and Y Gilad. Silencing of transposable elements may not be a major driver of regulatory evolution in primate iPSCs. eLife 7: e33084.

D Gerard, L F V Ferrão, A A F Garcia and M Stephens. Harnessing empirical Bayes and Mendelian segregation for genotyping autopolyploids from messy sequencing data. bioRxiv doi:10.1101/281550. R package | code used to produce results in paper

W Wang and M Stephens. Empirical Bayes matrix factorization. arXiv:1802.06931. R package | code used to produce results in paper

J Smith, G Coop, M Stephens and J Novembre. Estimating time to the common ancestor for a beneficial allele. Molecular Biology and Evolution 35(4): 1003–1017.


K Dey, D Xie and M Stephens. A new sequence logo plot to highlight enrichment and depletion. bioRxiv doi:10.1101/226597. R package | code used to produce results in paper

P Carbonetto, X Zhou and M Stephens. varbvs: fast variable selection for large-scale regression. arXiv:1709.06597. software

X Zhu and M Stephens. A large-scale genome-wide enrichment analysis identifies new trait-associated genes, pathways and tissues across 31 human phenotypes. bioRxiv doi:10.1101/160770. software | online results

D Gerard and M Stephens. Unifying and generalizing methods for removing unwanted variation based on negative controls. arXiv:1705.08393. R package | code used to produce results in paper

S M Urbut, G Wang and M Stephens. Flexible statistical methods for estimating and testing effects in genomic studies with multiple conditions. bioRxiv doi:10.1101/096552. Software

X Zhu and M Stephens. Bayesian large-scale multiple regression with summary statistics from genome-wide association studies. Annals of Applied Statistics 11(3): 1561-1592. Supplemental materials | Software

I E Schor, J F Degner, D Harnett, E Cannavo, F P Casale, H Shim, D A Garfield, E Birney, M Stephens, O Stegle, and E E M Furlong. Promoter shape varies across populations and affects promoter evolution and expression noise. Nature Genetics 49(4): 550-558. Supplementary Text and Figures

K Dey, C Hsiao, and M Stephens. Clustering RNA-seq expression data using grade of membership models. PLoS Genetics 13(3): e1006599. R package on Github | R package on Bioconductor | code used to produce results in paper

M Stephens. False discovery rates: a new deal. Biostatistics 18 (2): 275-294. Supplementary Materials | R package on CRAN | R package on Github | code used to produce results in paper


Z Xing and M Stephens. Smoothing via Adaptive Shrinkage (smash): denoising Poisson and heteroskedastic Gaussian signals. http://arxiv.org/abs/1605.07787. Software

M Lu and M Stephens. Variance adaptive shrinkage (vash): Flexible empirical Bayes estimation of variances. Bioinformatics 32(22): 3428-3434. Supplementary Materials | Software

A Raj, S Wang, H Shim, A Harpak, Y I Li, B Engelmann, M Stephens, Y Gilad, and J K Pritchard. Thousands of novel translated open reading frames in humans inferred by ribosome footprint profiling. eLife 2016;5:e13328.

D Petkova, J Novembre, and M Stephens. Visualizing spatial population structure with estimated effective migration surfaces. Nature Genetics 48: 94-100. Supplementary Text and Figures | Software


S Mondol, I Moltke, J Hart, M Keigwin, L Brown, M Stephens, and S K Wasser. New evidence for hybrid zones of forest and savanna elephants in Central and West Africa. Mol Ecol 24(24): 6134-6147, December 2015.

Y Shiraishi, G Tremmel, S Miyano, and M Stephens. A simple model-based approach to inferring and visualizing cancer mutation signatures. PLoS Genetics 11(12): e1005657, December 2015.

A Raj, H Shim, Y Gilad, J K Pritchard, and M Stephens. msCentipede: Modeling heterogeneity across genomic sites improves accuracy in the inference of transcription factor binding. PLoS ONE 10(9): e0138030. Software

H Shim and M Stephens. Wavelet-based genetic association analysis of functional phenotypes arising from high-throughput sequencing assays. Annals of Applied Statistics 9(2): 665-686. Software

K G Ardlie et al. The Genotype-Tissue Expression (GTEx) pilot analysis: Multitissue gene regulation in humans. Science 348(6235): 648-660, May 2015.

H Shim, D I Chasman, J D Smith, S Mora, P M Ridker, D A Nickerson, R M Krauss, and M Stephens. A multivariate genome-wide association analysis of 10 LDL subfractions, and their response to statin treatment, in 1868 Caucasians. PLoS ONE 10(4): e0120758. Software | Data

Z Gao, D Waggoner, M Stephens, C Ober, and M Przeworski. An estimate of the average number of recessive lethal mutations carried by humans. Genetics 199(4): 1243-1254.

J Tung, X Zhou, S C Alberts, M Stephens, and Y Gilad. The genetic architecture of gene expression levels in wild baboons. eLife 2015;4:e04729.


X Zhou, C E Cain, M Myrthil, N Lewellen, K Michelini, E R Davenport, M Stephens, J K Pritchard, and Y Gilad. Epigenetic modifications are associated with inter-species gene expression variation in primates. Genome Biology 15(12): 547, December 2014.

X Wen and M Stephens. Bayesian methods for genetic association analysis with heterogeneous subgroups: From meta-analyses to gene-environment interactions. Annals of Applied Statistics 8(1): 176-203. Supplementary Text

X Zhou and M Stephens. Efficient multivariate linear mixed model algorithms for genome-wide association studies. Nature Methods 11: 407-409. Software

A Raj, M Stephens, and J K Pritchard. fastSTRUCTURE: Variational Inference of Population Structure in Large SNP Data Sets. Genetics 197(2): 573-589, June 2014. Software


L M Mangravite, B E Engelhardt, M W Medina, J D Smith, C D Brown, D I Chasman, B H Mecham, B Howie, H Shim, D Naidoo, Q Feng, M J Rieder, Y D Chen, J I Rotter, P M Ridker, J C Hopewell, S Parish, J Armitage, R Collins, R A Wilke, D A Nickerson, M Stephens, and R M Krauss. A statin-dependent QTL for GATM expression is associated with statin-induced myopathy. Nature 502: 377-380.

P Carbonetto and M Stephens. Integrated enrichment analysis of variants and pathways in genome-wide association studies indicates central role for IL-2 signaling genes in type 1 diabetes, and cytokine signaling genes in Crohn's disease. PLoS Genetics 9(10): e1003770. Software

F Luca, J C Maranville, A L Richards, D B Witonsky, M Stephens, and A Di Rienzo. Genetic, functional and molecular features of glucocorticoid receptor binding. PLoS ONE 8(4): e61654.

M Stephens. A Unified framework for association analysis with multiple related phenotypes. PLoS ONE 8(7): e65245.

T Flutre, X Wen, J Pritchard, and M Stephens. A statistical framework for joint eQTL analysis in multiple tissues. PLoS Genetics 9(5): e1003486. Software

X Zhou, P Carbonetto, and M Stephens. Polygenic modeling with Bayesian sparse linear mixed models. PLoS Genetics 9(2): e1003264. Software


Q Li, J K Eng, and M Stephens. A likelihood-based scoring method for peptide identification using mass spectrometry. Annals of Applied Statistics 6(4): 1775-1794. arXiv

A A Pai, C E Cain, O Mizrahi-Man, S De Leon, N Lewellen, J-B Veyrieras, J F Degner, D J Gaffney, J K Pickrell, M Stephens, J K Pritchard, and Y Gilad. The Contribution of RNA Decay Quantitative Trait Loci to Inter-Individual Variation in Steady-State Gene Expression Levels. PLoS Genetics 8(10), e1003000, October 2012.

A B Hart, B E Engelhardt, M C Wardle, G Sokoloff, M Stephens, H de Wit, and A A Palmer. Genome-Wide Association Study of d-Amphetamine Response in Healthy Volunteers Identifies Putative Associations, Including Cadherin 13 (CDH13). PLoS ONE 7(8): e42646.

J Maranville, F Luca, M Stephens, and A Di Rienzo. Mapping gene-environment interactions at regulatory polymorphisms: insights into mechanisms of phenotypic variation. Transcription 3(2): 56-62.

B Howie, C Fuchsberger, M Stephens, J Marchini, and GR Abecasis. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing. Nature Genetics 44(8): 955-959.

X Zhou and M Stephens. Genome-wide efficient mixed model analysis for association studies. Nature Genetics 44(7): 821-824. Supplementary Text | Software

A Q Fu, D P Genereux, R Stoger, A F Burden, C D Laird, and M Stephens. Statistical inference of in vivo properties of human DNA methyltransferases from double-stranded methylation patterns. PLoS ONE 7(3): e32225. Software

G H Perry, P Melsted, J C Marioni, Y Wang, R Bainer, J K Pickrell, K Michelini, S Zehr, A D Yoder, M Stephens, J K Pritchard, and Y Gilad. Comparative RNA sequencing reveals substantial genetic variation in endangered primates. Genome Research 22: 602-610, 2011.

P Carbonetto and M Stephens. Scalable variational inference for Bayesian variable selection in regression, and its accuracy in genetic association studies. Bayesian Analysis 7(1): 73-108, March 2012. Software

J-B Veyrieras, D J Gaffney, J K Pickrell, Y Gilad, M Stephens, and J K Pritchard. Exon-specific QTLs skew the inferred distribution of expression QTLs detected using gene expression array data. PLoS ONE 7(2): e30629, February 2012.

J F Degner, A A Pai, R Pique-Regi, J-B Veyrieras, D J Gaffney, J K Pickrell, S De Leon, K Michelini, N Lewellen, G E Crawford, M Stephens, Y Gilad, and J K Pritchard. DNase I sensitivity QTLs are a major determinant of human expression variation. Nature 482: 390-394, February 2012.

D J Gaffney, J-B Veyrieras, J F Degner, R Pique-Regi, A A Pai, G E Crawford, M Stephens, Y Gilad, and J K Pritchard. Dissecting the regulatory architecture of gene expression QTLs. Genome Biology 13:R7, January 2012.


L E Mechanic, H-S Chen, C I Amos, N Chatterjee, N J Cox, R L Divi, R Fan, E L Harris, K Jacobs, P Kraft, S M Leal, K McAllister, J H Moore, D N Paltoo, M A Province, E M Ramos, M D Ritchie, K Roeder, D J Schaid, M Stephens, D C Thomas, C R Weinberg, J S Witte, S Zhang, S Zollner, E J Feuer, and E M Gillanders. Next generation analytic tools for large scale genetic epidemiology studies of complex diseases. Genetic Epidemiology 36(1): 22-35, December 2011.

B Howie, J Marchini, and M Stephens. Genotype imputation with thousands of genomes. G3: Genes, Genomics, Genetics 1(6): 457-470, November 2011.

Y Guan and M Stephens. Bayesian Variable Selection Regression for Genome-wide Association Studies, and other Large-Scale Problems. Annals of Applied Statistics 5(3): 1780-1815, September 2011.

J C Maranville, F Luca, A L Richards, X Wen, D B Witonsky, S Baxter, M Stephens, and A Di Rienzo. Interactions between Glucocorticoid Treatment and Cis-Regulatory Polymorphisms Contribute to Cellular Response Phenotypes. PLoS Genetics 7(7): e1002162, July 2011.

A Fledel-Alon, E M Leffler, Y Guan, M Stephens, G Coop, and M Przeworski. Variation in Human Recombination Rates and Its Genetic Determinants. PLoS ONE 6(6): e20321, June 2011.


A Q Fu, D P Genereux, R Stoger, C D Laird, and M Stephens. Statistical inference of transmission fidelity of DNA methylation patterns over somatic cell divisions in mammals. Annals of Applied Statistics 4(2): 871-892, June 2010.

Q Li, M MacCoss, and M Stephens. A nested mixture model for protein identification using mass spectrometry. Annals of Applied Statistics 4(2): 962-987, June 2010.

X Wen and M Stephens. Using linear predictors to impute allele frequencies from summary or pooled genotype data. Annals of Applied Statistics 4(3): 1158-1182, September 2010. Software

J Novembre and M Stephens. Response to Cavalli-Sforza interview. Human Biology 82(4): 469-470, August 2010.

L B Barreiro, J C Marioni, R Blekhman, M Stephens, and Y Gilad. Functional comparison of innate immune signaling pathways in primates. PLoS Genetics 6(12): e1001249, December 2010.

B E Engelhardt and M Stephens. Analysis of population structure: a unifying framework and novel methods based on sparse factor analysis. PLoS Genetics 6(9): e1001117. Software

J K Pickrell, J C Marioni, A A Pai, J F Degner, B E Engelhardt, E Nkadori, J B Veyrieras, M Stephens, Y Gilad, and J K Pritchard. Understanding mechanisms underlying human gene expression variation with RNA sequencing. Nature 464(7289):768-72, Mar 2010.

M J Barber, L M Mangravite, C L Hyde, D I Chasman, J D Smith, C A McCarty, X Li, R A Wilke, M J Rieder, P T Williams, P M Ridker, A Chatterjee, J I Rotter, D A Nickerson, M Stephens, and R M Krauss. Genome-wide association of lipid-lowering response to statins in combined study populations. PLoS ONE Mar 2010. Supplementary Results Data Page

R Blekhman, J C Marioni, P Zumbo, M Stephens, and Y Gilad. Sex-specific and lineage-specific alternative splicing in primates. Genome Res 20(2):180-9, Feb 2010.


S Basu, M Stephens, J S Pankow, and E A Thompson. A Likelihood-Based Trait-Model-Free Approach for Linkage Detection of Binary Trait. Biometrics 66(1):205-13, Mar 2009.

M J Hubisz, D Falush, M Stephens, and J K Pritchard. Inferring weak population structure with the assistance of sample group information. Molecular Ecology Resources 9:1322-1332, 2009.

M Stephens and D J Balding. Bayesian statistical methods for genetic association studies. Nat Rev Genet 10(10):681-90, Oct 2009.


Y Guan and M Stephens. Practical issues in imputation-based association mapping. PLoS Genet 4(12), Dec 2008.

J Novembre, T Johnson, K Bryc, Z Kutalik, A R Boyko, A Auton, A Indap, K S King, S Bergmann, M R Nelson, M Stephens, and C D Bustamante. Genes mirror geography within Europe. Nature 456(7218):98-101, Nov 2008.

J B Veyrieras, S Kudaravalli, S Y Kim, E T Dermitzakis, Y Gilad, M Stephens, and J K Pritchard. High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet 4(10), Oct 2008.

S K Wasser, W J Clark, O Drori, E S Kisamo, C Mailand, B Mutayoba, and M Stephens. Combating the illegal trade in African elephant ivory with DNA forensics. Conserv Biol 22(4):1065-1071, Aug 2008.

J C Marioni, C E Mason, S M Mane, M Stephens, and Y Gilad. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res 18(9):1509-1517, Sep 2008. Software

P Scheet and M Stephens. Linkage disequilibrium-based quality control for large-scale genetic studies. PLoS Genet 4(8), 2008.

A P Reiner, M J Barber, Y Guan, P M Ridker, L A Lange, D I Chasman, J D Walston, G M Cooper, N S Jenny, M J Rieder, J P Durda, J D Smith, J Novembre, R P Tracy, J I Rotter, M Stephens, D A Nickerson, and R M Krauss. Polymorphisms of the HNF1A gene encoding hepatocyte nuclear factor-1 alpha are associated with C-reactive protein. Am J Hum Genet 82(5):1193-1201, May 2008.

J Novembre and M Stephens. Interpreting principal component analyses of spatial population genetic variation. Nat Genet 40(5):646-649, May 2008. Supplementary Materials


The International HapMap Consortium. A second generation human haplotype map of over 3.1 million SNPs. Nature 449(7164):851-61, Oct 2007.

B Servin and M Stephens. Imputation-based analysis of association studies: candidate regions and quantitative traits. PLoS Genet 3(7), Jul 2007. Supplementary File S1 | Supplementary File S2

D Falush, M Stephens, and J K Pritchard. Inference of population structure using multilocus genotype data: dominant markers and null alleles. Molecular Ecology Notes 7:574-578, Jul 2007.

A Roychoudhury and M Stephens. Fast and accurate estimation of the population-scaled mutation rate, theta, from microsatellite genotype data. Genetics 176(2):1363-1366, Jun 2007.

G Hellenthal and M Stephens. msHOT: modifying Hudson's ms simulator to incorporate crossover and gene conversion hotspots. Bioinformatics 23(4):520-521, 2007.

S K Wasser, C Mailand, R Booth, B Mutayoba, E Kisamo, B Clark, and M Stephens. Using DNA to track the origin of the largest ivory seizure since the 1989 trade ban. Proc Natl Acad Sci U S A 104(10):4228-4233, Mar 2007.


T De Raedt, M Stephens, I Heyns, H Brems, D Thijs, L Messiaen, K Stephens, C Lazaro, K Wimmer, H Kehrer-Sawatzki, D Vidaud, L Kluwe, P Marynen, and E Legius. Conservation of hotspots for recombination in low-copy repeats associated with the NF1 microdeletion. Nat Genet 38(12):1419-1423, 2006.

T R Bhangale, M Stephens, and D A Nickerson. Automating resequencing-based detection of insertion-deletion polymorphisms. Nat Genet 38(12):1457-1462, 2006.

G Hellenthal and M Stephens. Insights into recombination from population genetic variation. Curr Opin Genet Dev 16(6):565-572, 2006.

R Gottardo, J Besag, M Stephens, and A Murua. Probabilistic segmentation and intensity estimation for microarray images. Biostatistics 7(1):85-99, 2006.

G Hellenthal, J K Pritchard, and M Stephens. The effects of genotype-dependent recombination, and transmission asymmetry, on linkage disequilibrium. Genetics 172(3):2001-2005, 2006.

M Stephens, J S Sloan, P D Robertson, P Scheet, and D A Nickerson. Automating sequence-based detection and genotyping of SNPs from diploid samples. Nat Genet 38(3):375-381, 2006.

P Scheet and M Stephens. A fast and flexible method for large-scale population genotype data: applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet 78(4):629-644, 2006.

J Marchini, D Cutler, N Patterson, M Stephens, E Eskin, E Halperin, S Lin, Z S Qin, H M Munro, G R Abecasis, and P Donnelly. A comparison of phasing algorithms for trios and unrelated individuals. Am J Hum Genet 78(3):437-450, 2006.


The International HapMap Consortium. A haplotype map of the human genome. Nature 437(7063):1299-1320, 2005.

M Stephens and P Scheet. Accounting for decay of linkage disequilibrium in haplotype inference and missing-data imputation. Am J Hum Genet 76(3):449-462, 2005.


D C Crawford, T Bhangale, N Li, G Hellenthal, M J Rieder, D A Nickerson, and M Stephens. Evidence for substantial fine-scale variation in recombination rates across the human genome. Nat Genet 36(7):700-706, 2004.

S E Ptak, A D Roeder, M Stephens, Y Gilad, S Paabo, and M Przeworski. Absence of the TAP2 human recombination hotspot in chimpanzees. PLoS Biology 2(6):849-855, 2004.

S K Wasser, A M Shedlock, K Comstock, E A Ostrander, B Mutayoba, and M Stephens. Assigning African elephant DNA to geographic region of origin: applications to the ivory trade. Proc Natl Acad Sci U S A 101(41):14844-14852, 2004. Software


N Li and M Stephens. Modeling linkage disequilibrium and identifying recombination hotspots using single-nucleotide polymorphism data. Genetics 165(4):2213-2233, 2003.

M Stephens and P Donnelly. A comparison of Bayesian methods for haplotype reconstruction from population genotype data. Am J Hum Genet 73(5):1162-1169, 2003.

D Falush, M Stephens, and J K Pritchard. Inference of population structure from multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164:1567-1587, 2003.

D Falush, T Wirth, D Linz, J K Pritchard, M Stephens, M Kidd, M J Blaser, D Y Graham, S Vacher, G I Perez-Perez, Y Yamaoka, F Megraud, K Otto, U Reichard, E Katzowitsch, X Wang, M Achtman, and S Suerbaum. Traces of human migrations in Helicobacter pylori populations. Science 299(5612):1582-1585, 2003.

M Stephens and P Donnelly. Ancestral inference in population genetics models with selection. Australian and New Zealand Journal of Statistics 45:901-931, 2003.


M Stephens, N J Smith, and P Donnelly. A new statistical method for haplotype reconstruction from population data. Am J Hum Genet 68(4):978-989, 2001.


J K Pritchard, M Stephens, and P Donnelly. Inference of population structure using multilocus genotype data. Genetics 155(2):945-959, 2000.

J K Pritchard, M Stephens, N A Rosenberg, and P Donnelly. Association mapping in structured populations. Am J Hum Genet 67:170-181, 2000.

M Stephens and P Donnelly. Inference in molecular population genetics. J R Stat Soc, Ser B 62:605-655, 2000.

M Stephens. Dealing with label-switching in mixture models. J R Stat Soc, Ser B 62:795-809, 2000.

M Stephens. Bayesian analysis of mixtures with an unknown number of components - an alternative to reversible jump methods. Annals of Statistics 28(1), 2000.

M Stephens. Times on trees and the age of an allele. Theoretical Population Biology 57:109-119, 2000.


M Stephens. Problems with computational methods in population genetics. Bulletin of the International Statistical Institute 52nd session, 1999.

Matthew's Theses

M Stephens. Bayesian Methods for Mixtures of Normal Distributions. DPhil thesis - University of Oxford.

M Stephens. The Results of Gregor Mendel - An analysis and comparison with the results of other researchers. Dip Stat thesis - University of Cambridge.

Student Theses

A Fu. Models and Inference of Transmission of DNA Methylation Patterns in Mammalian Somatic Cells. PhD thesis - University of Washington, 2008.

G Hellenthal. Exploring Rates and Patterns of Variability in Gene Conversion and Crossover in the Human Genome. PhD thesis - University of Washington, 2006.

Q Li. Statistical Methods for Peptide and Protein Identification using Mass Spectrometry. PhD thesis - University of Washington, 2008.

D Petkova. Inferring Effective Migration from Geographically Indexed Genetic Data. PhD thesis - University of Chicago, 2013.

P Scheet. A Flexible and Computationally Tractable Model for Patterns of Population Genetic Variation. PhD thesis - University of Washington, 2006.

S Urbut. Flexible statistical methods for jointly modeling effects. PhD thesis - University of Chicago, 2017.

Z Xing. Poisson Multiscale Methods for High-throughput Sequencing Data. PhD thesis - University of Chicago, 2016.

K Xu. EbayesThresh with Heterogeneous Variance. Masters thesis - University of Chicago, 2017.

W Wang. Applications of adaptive shrinkage in multiple statistical problems. PhD thesis - University of Chicago, 2017.

X Wen. Bayesian Analysis of Genetic Association Data, Accounting for Heterogeneity. PhD thesis - University of Chicago, 2011.

X Zhu. A Bayesian large-scale multiple regression model for genome-wide association summary statistics. PhD thesis - University of Chicago, 2017.

© 2009-2015 Matthew Stephens