We separate the true associations from false positives using the posterior distribution of the effects (regression coefficients) provided by Bayesian LASSO. We propose to solve the multiple comparisons problem by using simultaneous inference based on the joint posterior distribution of the effects. Bayesian LASSO also tends to distribute an effect among collinear variables, making detection of an association difficult. We propose to solve this problem by considering not only individual effects but also their functionals. Finally, whereas in Bayesian LASSO the tuning parameter is often regarded as a random variable, we instead adopt a scale space view and consider a whole range of fixed tuning parameters. The effect estimates and the associated inference are considered for all tuning parameters in the selected range, and the results are visualized with color maps that provide useful insights into the data and the association problem considered. The methods are illustrated using two sets of artificial data and one real data set, all representing typical settings in association genetics.

Citation: Pasanen L, Holmström L, Sillanpää MJ (2015) Bayesian LASSO, Scale Space and Decision Making in Association Genetics.

Published: April 9, 2015

Copyright: © 2015 Pasanen et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: The Matlab codes used for the computations, together with instructions for their use and examples, are available at /general/SSBLASSO.zip.

Funding: Funding for the research was provided by the Academy of Finland. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

A large number of markers, segments of the DNA molecule, are commonly available in genetic studies involving association mapping and genomic prediction. Mapping studies focus on finding a few major genes called quantitative trait loci (QTL) out of a large number of markers. In genomic prediction, the number of markers included in the model depends on the genetic architecture and the extent of collinearity between markers. In these studies, all markers are included in the model a priori and variable selection is applied to arrive at a "sparse" subset of trait-associated marker effects a posteriori. Variable selection regularizes the problem so that the number of estimated non-zero marker effects hopefully becomes smaller than the number of observations, leading to meaningful but downwardly biased effect estimates.

Bayesian variable selection methods have been widely applied for QTL mapping and genomic prediction. Variable selection in these models typically uses either shrinkage analysis or auxiliary indicator variables (i.e., spike-and-slab variable selection). The former method specifies a marker effect prior that shrinks the effects of negligible markers heavily towards zero, while assigning large effects to important markers. In the latter technique, each marker has its own auxiliary indicator variable that has a much smaller prior probability for marker inclusion than for marker exclusion. While Bayesian shrinkage models are common tools for genomic prediction, rigorous decision making in QTL mapping studies is still an open research problem with such models.

The methods developed in this paper provide novel tools for finding relevant genomic markers using the Bayesian LASSO, a shrinkage-based method that uses a Laplacian prior distribution for marker effects. We consider four problems that need special attention: (i) controlling false positives, (ii) collinearity among explanatory variables, (iii) multiple comparisons and (iv) the choice of the tuning parameter that controls the amount of shrinkage and the sparsity of the estimates. The proposed techniques are also relevant in other contexts where LASSO is used for variable selection.

Controlling false positives: One approach for controlling false positives is to use Bayes factors. While marker-specific Bayes factors can be derived for models with indicator variables, including indicator variables in shrinkage models can make the resulting Bayes factors suffer from a double shrinkage effect. Also, deriving the posteriors of indicator variables after the fact in shrinkage models usually depends heavily on user-defined cut-off values for judging the QTLs.
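As a rough illustration of the simultaneous inference on effects and their functionals mentioned in the abstract, the following Python sketch builds a simultaneous credible band from posterior draws and applies it jointly to individual marker effects and to the sum over a collinear pair. It is only a sketch under stated assumptions: the posterior draws are synthetic stand-ins for MCMC output, the names `simultaneous_band`, `b1`, `b2` and `pair_sum` are hypothetical, and the max-deviation band is one common construction, not necessarily the one used in the paper or in its Matlab code.

```python
import numpy as np


def simultaneous_band(draws, level=0.95):
    """Band that contains roughly a `level` share of whole posterior draws.

    draws: array of shape (n_draws, n_quantities).
    Returns (lower, upper), each of shape (n_quantities,).
    """
    center = draws.mean(axis=0)
    scale = draws.std(axis=0, ddof=1)
    # For each draw, the largest standardized deviation over all quantities.
    max_dev = np.abs((draws - center) / scale).max(axis=1)
    radius = np.quantile(max_dev, level)
    return center - radius * scale, center + radius * scale


# ASSUMPTION: synthetic stand-in for MCMC output, not data from the paper.
rng = np.random.default_rng(0)
n_draws = 4000
# Effect mass that the sampler trades back and forth between two collinear markers.
shared = rng.normal(0.0, 0.4, n_draws)
b1 = 0.5 + shared + rng.normal(0.0, 0.05, n_draws)
b2 = 0.5 - shared + rng.normal(0.0, 0.05, n_draws)
nulls = rng.normal(0.0, 0.1, (n_draws, 3))  # three markers with no effect
effects = np.column_stack([b1, b2, nulls])

# Functional of the effects: the sum over the collinear pair.
pair_sum = effects[:, 0] + effects[:, 1]
quantities = np.column_stack([effects, pair_sum])

# One band over all effects and the functional, so the comparisons are joint.
lower, upper = simultaneous_band(quantities, level=0.95)
flags = (lower > 0) | (upper < 0)  # True where the band excludes zero

labels = ["b1", "b2", "null1", "null2", "null3", "b1+b2"]
for name, lo, hi, flag in zip(labels, lower, upper, flags):
    print(f"{name:>6}: [{lo:+.2f}, {hi:+.2f}]  credibly nonzero: {flag}")
```

With these synthetic draws, neither marker of the collinear pair is flagged on its own, while their sum is. In a scale space analysis, the same check could be repeated for posterior draws obtained at each fixed tuning parameter, and the resulting flags arranged into a map over markers and tuning parameters.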