<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Matt Simpson</title>
	<atom:link href="http://www.themattsimpson.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.themattsimpson.com</link>
	<description>PhD student in statistics and economics, Iowa State University</description>
	<lastBuildDate>Thu, 11 Apr 2013 19:45:10 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Using R for Experimental Economics</title>
		<link>http://www.themattsimpson.com/2013/04/11/using-r-for-experimental-economics/</link>
		<comments>http://www.themattsimpson.com/2013/04/11/using-r-for-experimental-economics/#comments</comments>
		<pubDate>Thu, 11 Apr 2013 19:32:31 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[econ]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Rstudio]]></category>

		<guid isPermaLink="false">http://www.themattsimpson.com/?p=218</guid>
		<description><![CDATA[I'm giving a short talk to the experimental economics class I'm taking on how to use R to do the most common tests. This is a short tutorial in order to get R up and running with R studio. I &#8230; <a class="more-link" href="http://www.themattsimpson.com/2013/04/11/using-r-for-experimental-economics/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>I'm giving a short talk to the experimental economics class I'm taking on how to use R to run the most common statistical tests. This short tutorial gets R up and running with RStudio, and my talk assumes you've already worked through it. If you're looking for my slides from the talk or the dataset I worked with, they're at the bottom of this page along with some other useful resources.</p>
<p>Let's get started. First, go to R's <a href="http://www.r-project.org/" target="_blank">website</a>, which should look something like this:</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/Rcran.png" alt="" /></p>
<p>On the left-hand side, click on the CRAN link under "Download, Packages", next to the arrow in the image above. This will bring up a page of mirrors to download R from, like the image below. Choose a location close to you so that the download is faster.</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/Rcran2.png" alt="" /></p>
<p>This should bring you to a page that looks like the image below. Now click on the link to download R for whatever operating system you're running. For example, if your computer runs Windows, click on "Download R for Windows."</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/Rcran3.png" alt="" /></p>
<p>This should bring up another page with a few more links, like the picture below. Click on "base."</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/Rcran4.png" alt="" /></p>
<p>Finally we get to the page with the actual download link, like below. Click on "Download R Z for X" where X is whatever operating system you have and Z is the latest version of R, e.g. some number like "2.15.1". At the time I created this post, it was "Download R 2.15.1 for Windows" since I was installing R on a Windows machine.</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/Rcran5.png" alt="" /></p>
<p>This will download the executable to install R - a file named "R-2.15.1-win.exe" or something similar. Make sure you save it somewhere you can find it easily. When it finishes, run the file. Just follow the on-screen instructions that pop up. You shouldn't have to change anything in order for R to properly install.</p>
<p>Now you're all set to start using R... except the GUI that R comes with out of the box isn't very good. <a href="http://rstudio.org/" target="_blank">RStudio</a> is a free IDE that improves on the base R GUI substantially. Go <a href="http://rstudio.org/download/" target="_blank">here</a> to download it. Download the version of RStudio that their website recommends for your machine and save it somewhere that you can easily find. Once the download completes, open the file - it should be called "RStudio-0.96.316.exe" or something similar. From this point, just follow the on-screen instructions to complete the installation.</p>
<p>Now we'll install a couple of useful packages that exist for R. First we'll install the R package "ggplot2". ggplot2 is a package for creating statistical graphics that drastically improves upon R's base graphics. </p>
<p>To install it, open RStudio and make sure you're connected to the internet. Then type install.packages("ggplot2") into the R console and hit enter, as below.</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/Rggplot21.png" alt="" /></p>
<p>R will output the following message, or something similar:</p>
<blockquote><p>Installing package(s) into ‘C:/Users/Matt/R/win-library/2.15’<br />
(as ‘lib’ is unspecified)<br />
--- Please select a CRAN mirror for use in this session ---</p></blockquote>
<p>Wait for a few seconds, then R will give you some options for which mirror to download from. Type the number for the mirror that is closest to you to download everything faster, then press enter. See the picture below.</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/Rggplot22.png" alt="" /></p>
<p>R should be pretty busy for a few minutes while it downloads and installs several packages that ggplot2 depends on as well as ggplot2 itself. Once it finishes and you see the blue ">" prompt at the bottom of the R console, the packages are installed.</p>
<p>There's another package you need to install using basically the same process. agricolae is a package that has a bunch of methods for agricultural statistics, but more importantly it has some useful nonparametric tests. To install it, type install.packages("agricolae") into the R console and follow the same process as before.</p>
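<p>If you'd like to check whether both packages made it onto your machine, here's a rough sketch. The helper name missing_packages is something I made up for illustration; it isn't part of R. The repos URL in the commented-out line is CRAN's cloud mirror, which skips the mirror-selection prompt.</p>
<pre><code># Return the packages from `pkgs` that are not installed yet.
# (missing_packages is a hypothetical helper name used only in this sketch.)
missing_packages &lt;- function(pkgs) {
  installed &lt;- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install &lt;- missing_packages(c("ggplot2", "agricolae"))
to_install  # an empty character vector means both packages are installed

# Uncomment to install whatever is still missing:
# install.packages(to_install, repos = "https://cloud.r-project.org")
</code></pre>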
<p>In addition, I've uploaded a dataset that we'll be using during my presentation. Download it <a href="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/diam.csv" target="_blank">here</a>. Save it as diam.csv somewhere you can easily find it.</p>
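<p>To make sure the download worked, you can try reading the file into R. This is only a sketch: it assumes diam.csv sits in R's current working directory, so change csv_path to wherever you actually saved it.</p>
<pre><code># Assumption: diam.csv was saved in the current working directory.
csv_path &lt;- "diam.csv"

if (file.exists(csv_path)) {
  diam &lt;- read.csv(csv_path)  # read the file into a data frame
  str(diam)                   # column names and types
  print(head(diam))           # first few rows
} else {
  message("Could not find ", csv_path, " in ", getwd())
}
</code></pre>
<p>If R can't find the file, getwd() shows the folder it is looking in and setwd() changes it.</p>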
<p>That's all the software you'll need to be up and running! Here are a bunch of useful resources, including the slides from my talk:</p>
<p><a href="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/StatTalk.pdf" target="_blank">Slides from the presentation</a>. (pdf) Probably more useful while you're sitting at your computer than while I was talking.</p>
<p><a href="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/diam.csv" target="_blank">diam.csv</a>. The dataset I used for most examples in my presentation.</p>
<p><a href="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/demo.r" target="_blank">An R script containing every R command from my presentation</a>. Open with any text editor, though opening it in RStudio is best.</p>
<p><a href="http://stat.ethz.ch/consulting/Statistiksoftware/R/RStudio.pdf" target="_blank">RStudio tutorial</a>. (pdf) Useful for getting your bearings in R while using RStudio. It covers the basics of computation in R, including some stuff I didn't cover such as dealing with vectors and matrices.</p>
<p><a href="http://simpsonm.public.iastate.edu/BlogPosts/ExpEconStat/presentation.r" target="_blank">An R script containing an old presentation</a>. This one has many more details about the basics in R as well as using ggplot2, plus some stuff about quickly using Bayesian methods to fit models. Note: enter install.packages("arm") into the R console to use the Bayesian stuff.</p>
<p><a href="http://cran.r-project.org/doc/contrib/Short-refcard.pdf" target="_blank">An R reference card</a>. (pdf) Print it out and tape it on the wall next to your desk. Seriously. Do it now.</p>
<p><a href="http://had.co.nz/ggplot2/" target="_blank">The ggplot2 website</a>. This contains useful information for making complicated, informative and pretty graphics using the ggplot2 package. </p>
<p><a href="http://www.public.iastate.edu/~hofmann/stat579/" target="_blank">Course website for the class I took to learn R</a>. Some overlap, but there are many new things.</p>
<p><a href="http://yihui.name/knitr/" target="_blank">Knitr</a>. This is a fantastic way to integrate your computations from R into a nice compiled LaTeX file, and it's relatively painless with RStudio.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.themattsimpson.com/2013/04/11/using-r-for-experimental-economics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Introduction to Bayesian Statistics Videos</title>
		<link>http://www.themattsimpson.com/2013/03/22/introduction-to-bayesian-statistics-videos/</link>
		<comments>http://www.themattsimpson.com/2013/03/22/introduction-to-bayesian-statistics-videos/#comments</comments>
		<pubDate>Fri, 22 Mar 2013 03:12:11 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://www.themattsimpson.com/?p=212</guid>
		<description><![CDATA[My advisor, Jarad Niemi, is posting lectures on Bayesian statistics on his youtube channel while teaching Stat 544 - the master's/Ph.D. level introduction to Bayesian statistics at Iowa State University. I've taken a look at a few of them and &#8230; <a class="more-link" href="http://www.themattsimpson.com/2013/03/22/introduction-to-bayesian-statistics-videos/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>My advisor, <a href="http://niemiconsulting.com/" title="http://niemiconsulting.com/" target="_blank">Jarad Niemi</a>, is posting lectures on Bayesian statistics on his <a href="https://www.youtube.com/user/jaradniemi?feature=g-high-u" title="https://www.youtube.com/user/jaradniemi?feature=g-high-u" target="_blank">YouTube channel</a> while teaching Stat 544 - the master's/Ph.D. level introduction to Bayesian statistics at Iowa State University. I've taken a look at a few of them and they're pretty good. Most are short (~10 minute) explanations of a particular topic that supplement the required readings. Some of them, like the extended Metropolis-within-Gibbs example, are full lecture length, i.e. about one hour. Since the course is ongoing this semester, you can expect a few more to be posted.</p>
<p>The lectures do assume a fairly high-level understanding of probability theory. The statistics students in the class have seen at least chapters 1-5 of <a href="http://www.amazon.com/Statistical-Inference-George-Casella/dp/0534243126" title="http://www.amazon.com/Statistical-Inference-George-Casella/dp/0534243126" target="_blank">Casella and Berger</a> in some detail. Other students in the class have similar backgrounds, though perhaps not quite as strong. Some knowledge of R would also help with the more computationally centered lectures. While the videos might not be useful for everyone, they're probably a great supplement if you're learning some of this material elsewhere.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.themattsimpson.com/2013/03/22/introduction-to-bayesian-statistics-videos/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Prior distributions for covariance matrices: the scaled inverse-Wishart prior</title>
		<link>http://www.themattsimpson.com/2012/08/20/prior-distributions-for-covariance-matrices-the-scaled-inverse-wishart-prior/</link>
		<comments>http://www.themattsimpson.com/2012/08/20/prior-distributions-for-covariance-matrices-the-scaled-inverse-wishart-prior/#comments</comments>
		<pubDate>Mon, 20 Aug 2012 16:30:30 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Bayes]]></category>
		<category><![CDATA[covariance]]></category>
		<category><![CDATA[priors]]></category>

		<guid isPermaLink="false">http://www.themattsimpson.com/?p=113</guid>
		<description><![CDATA[At some point in a variety of Bayesian applications, we end up having to specify a prior distribution on a covariance matrix. The canonical example is a hierarchical regression model. Suppose we have a response vector along with some covariate &#8230; <a class="more-link" href="http://www.themattsimpson.com/2012/08/20/prior-distributions-for-covariance-matrices-the-scaled-inverse-wishart-prior/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>At some point in a variety of Bayesian applications, we end up having to specify a prior distribution on a covariance matrix. The canonical example is a hierarchical regression model. Suppose we have a response vector <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_415290769594460e2e485922904f345d.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="y" /></span><script type='math/tex'>y</script> along with some covariate <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_9dd4e461268c8034f5c8564e155c67a6.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="x" /></span><script type='math/tex'>x</script> where we expect the simple linear regression model to provide a decent fit. In addition, suppose the data is organized into distinct groups such that each group possibly has its own regression line, though we expect the lines to be related. In order to model this situation, we start first with the data level:</p>
<p><p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_6f99c7d93463d2cfe688b708a05a70d0.gif' style='vertical-align: middle; border: none;' class='tex' alt="y_{ij} | \alpha_j, \beta_j \stackrel{iid}{\sim} N(\alpha_j + x_{ij}\beta_j, \sigma^2)" /></span><script type='math/tex;  mode=display'>y_{ij} | \alpha_j, \beta_j \stackrel{iid}{\sim} N(\alpha_j + x_{ij}\beta_j, \sigma^2)</script></p></p>
<p>where <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_f97309ba8943d1077addf5695fdd7065.gif' style='vertical-align: middle; border: none; ' class='tex' alt="j=1,...,J" /></span><script type='math/tex'>j=1,...,J</script> indicates groups and <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_92f395d7600cd0ea8aa40593abd8daba.gif' style='vertical-align: middle; border: none; ' class='tex' alt="i=1,...,I_j" /></span><script type='math/tex'>i=1,...,I_j</script> indicates observations within groups. In other words, each group has a simple linear regression with a common error variance <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_10e16c6a764d367ca5077a54bf156f7e.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="\sigma^2" /></span><script type='math/tex'>\sigma^2</script>. We further model the regression coefficients as coming from a common distribution:</p>
<p><p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_bbfae4fb055462692c07c858d2329569.gif' style='vertical-align: middle; border: none;' class='tex' alt="\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix} \stackrel{iid}{\sim} N\left(\begin{pmatrix} \alpha_0 \\ \beta_0 \end{pmatrix}, {\mathbf \Sigma}\right)" /></span><script type='math/tex;  mode=display'>\begin{pmatrix} \alpha_j \\ \beta_j \end{pmatrix} \stackrel{iid}{\sim} N\left(\begin{pmatrix} \alpha_0 \\ \beta_0 \end{pmatrix}, {\mathbf \Sigma}\right)</script></p></p>
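<p>To make the two levels of the model concrete, here is a minimal R sketch that simulates data from it. Every number in it (the number of groups, the group size, the values of the means, the covariance matrix and the error standard deviation) is made up purely for illustration.</p>
<pre><code>set.seed(42)                  # arbitrary seed, for reproducibility only

J &lt;- 8                        # number of groups (hypothetical)
n_per_group &lt;- 20             # observations per group, I_j (hypothetical, equal across groups)
mu &lt;- c(1, 2)                 # (alpha_0, beta_0), made-up values
Sigma &lt;- matrix(c(1.0, 0.3,
                  0.3, 0.5), nrow = 2)  # made-up covariance of (alpha_j, beta_j)
sigma &lt;- 0.7                  # common error standard deviation (made up)

# Group level: (alpha_j, beta_j) ~ N(mu, Sigma), drawn via the Cholesky factor
z &lt;- matrix(rnorm(2 * J), nrow = J, ncol = 2)
coefs &lt;- sweep(z %*% chol(Sigma), 2, mu, "+")
colnames(coefs) &lt;- c("alpha", "beta")

# Data level: y_ij ~ N(alpha_j + x_ij * beta_j, sigma^2)
group &lt;- rep(seq_len(J), each = n_per_group)
x &lt;- runif(J * n_per_group, min = -1, max = 1)
y &lt;- rnorm(J * n_per_group,
           mean = coefs[group, "alpha"] + x * coefs[group, "beta"],
           sd = sigma)

sim &lt;- data.frame(group = factor(group), x = x, y = y)
print(head(sim))
</code></pre>
<p>Each row of coefs is one group's regression line, and all of the lines are tied together through the shared mean and covariance matrix.</p>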
<p>In order to complete the model and run a Bayesian analysis, we need to add priors for all remaining parameters including the covariance matrix for a group's regression coefficients <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_93f82be8ea0b85358dbe918a71a846af.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="{\mathbf \Sigma}" /></span><script type='math/tex'>{\mathbf \Sigma}</script>. A popular prior for <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_93f82be8ea0b85358dbe918a71a846af.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="{\mathbf \Sigma}" /></span><script type='math/tex'>{\mathbf \Sigma}</script> is the <a href="http://en.wikipedia.org/wiki/Inverse-Wishart_distribution" title="http://en.wikipedia.org/wiki/Inverse-Wishart_distribution" target="_blank">inverse-Wishart</a> distribution, but there are some problems with using it in this role. A <a href="http://dahtah.wordpress.com/2012/03/07/why-an-inverse-wishart-prior-may-not-be-such-a-good-idea/" title="http://dahtah.wordpress.com/2012/03/07/why-an-inverse-wishart-prior-may-not-be-such-a-good-idea/" target="_blank">post</a> over at <a href="http://dahtah.wordpress.com/" title="http://dahtah.wordpress.com/" target="_blank">dahtah</a> has the details, but essentially the problem is that under the standard "noninformative" version of the inverse-Wishart prior, which makes the marginal distribution of the correlations uniform, large standard deviations are associated with large absolute correlations. This isn't exactly noninformative, and it can also have adverse effects on inference in some models where we want shrinkage to occur.</p>
<p>The question, then, is what prior distribution should we use instead? A <a href="http://simpsonm.public.iastate.edu/ScaledInverseWishart/barnard.pdf" title="http://simpsonm.public.iastate.edu/ScaledInverseWishart/barnard.pdf" target="_blank">paper</a> (pdf) by Barnard, McCulloch and Meng argues for using a separation strategy: write <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_93f82be8ea0b85358dbe918a71a846af.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="{\mathbf \Sigma}" /></span><script type='math/tex'>{\mathbf \Sigma}</script> as <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_4fd34b8fa5dbe5e85d7bc8289c3eab31.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\Delta\Omega\Delta}" /></span><script type='math/tex'>\mathbf{\Delta\Omega\Delta}</script> where <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_aeefdcdd2c3d5e2889650df002ab9821.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\Delta}=\mathrm{diag}(\mathbf{\delta})" /></span><script type='math/tex'>\mathbf{\Delta}=\mathrm{diag}(\mathbf{\delta})</script> is a diagonal matrix of standard deviations and <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_f699fbc68f6c08e9aa52abdb067dffad.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="{\mathbf \Omega}" /></span><script type='math/tex'>{\mathbf \Omega}</script> is a correlation matrix. 
Then model <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_48e36e465de0708e276544b2d89f648c.gif' style='vertical-align: middle; border: none; ' class='tex' alt="{\mathbf \Delta}" /></span><script type='math/tex'>{\mathbf \Delta}</script> and <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_f699fbc68f6c08e9aa52abdb067dffad.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="{\mathbf \Omega}" /></span><script type='math/tex'>{\mathbf \Omega}</script> separately. There are a number of ways to do this, but Barnard et al. suggest using independent lognormal priors on the standard deviations in <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_0820affaacfa3621637420a8a87c849a.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\delta}" /></span><script type='math/tex'>\mathbf{\delta}</script>:</p>
<p><p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_eca44df47a416bacd5144329a9889b13.gif' style='vertical-align: middle; border: none;' class='tex' alt="\mathrm{log}(\delta_j)\stackrel{iid}{\sim}N(\delta_0,\sigma^2_\delta)" /></span><script type='math/tex;  mode=display'>\mathrm{log}(\delta_j)\stackrel{iid}{\sim}N(\delta_0,\sigma^2_\delta)</script></p>  </p>
<p>while using the following family of densities on <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_80c34080ca237effb36afd8356a378a8.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="\mathbf{\Omega}" /></span><script type='math/tex'>\mathbf{\Omega}</script>:</p>
<p><p style='text-align:center;'><span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_cd7afcd64bc4d452dc2be5ce28b6dc51.gif' style='vertical-align: middle; border: none;' class='tex' alt=" f_k(\mathbf{\Omega}|v) \propto |\mathbf{\Omega}|^{-\frac{1}{2}(v+k+1)}\left(\prod_i\omega^{ii}\right)^{-\frac{v}{2}} = |\mathbf{\Omega}|^{\frac{1}{2}(v-1)(k-1)-1}\left(\prod_i|\mathbf{\Omega}_{ii}|\right)^{-\frac{v}{2}} " /></span><script type='math/tex;  mode=display'> f_k(\mathbf{\Omega}|v) \propto |\mathbf{\Omega}|^{-\frac{1}{2}(v+k+1)}\left(\prod_i\omega^{ii}\right)^{-\frac{v}{2}} = |\mathbf{\Omega}|^{\frac{1}{2}(v-1)(k-1)-1}\left(\prod_i|\mathbf{\Omega}_{ii}|\right)^{-\frac{v}{2}} </script></p> </p>
<p>where <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_5301165f8049eeef69a61c98c5b433f9.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\omega^{ii}" /></span><script type='math/tex'>\omega^{ii}</script> is the <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_865c0c0b4ab0e063e5caa3387c1a8741.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="i" /></span><script type='math/tex'>i</script>'th diagonal element of <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_ef4d2f4d6dd77b1149898d4a1b90c238.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\Omega}^{-1}" /></span><script type='math/tex'>\mathbf{\Omega}^{-1}</script>, <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_6b18e3d1bb1c08c4ca3651da93cbb757.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\Omega}_{ii}" /></span><script type='math/tex'>\mathbf{\Omega}_{ii}</script> is the <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_865c0c0b4ab0e063e5caa3387c1a8741.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="i" /></span><script type='math/tex'>i</script>'th principal sub-matrix of <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_80c34080ca237effb36afd8356a378a8.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="\mathbf{\Omega}" /></span><script type='math/tex'>\mathbf{\Omega}</script> (what remains after deleting that row and column), <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_8ce4b16b22b58894aa86c421e8759df3.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="k" /></span><script type='math/tex'>k</script> is the dimension of <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_80c34080ca237effb36afd8356a378a8.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="\mathbf{\Omega}" /></span><script type='math/tex'>\mathbf{\Omega}</script> and <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_9e3669d19b675bd57058fd4664205d2a.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="v" /></span><script type='math/tex'>v</script> is a tuning parameter. This density is obtained by transforming an inverse-Wishart random matrix, specifically <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_49a282fe1f7d589465dae0dd29cedc35.gif' style='vertical-align: middle; border: none; ' class='tex' alt="IW(v, \mathbf{I})" /></span><script type='math/tex'>IW(v, \mathbf{I})</script>, into a correlation matrix. It turns out that <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_84692cd7951059b2c69f53c304177f8e.gif' style='vertical-align: middle; border: none; ' class='tex' alt="v=k+1" /></span><script type='math/tex'>v=k+1</script> results in uniform marginal distributions on the correlations in <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_80c34080ca237effb36afd8356a378a8.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="\mathbf{\Omega}" /></span><script type='math/tex'>\mathbf{\Omega}</script>.</p>
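<p>Here's a rough sketch of that transformation in base R: draw inverse-Wishart matrices, convert each to a correlation matrix, and look at the off-diagonal element. It relies on the fact that inverting a Wishart draw gives an inverse-Wishart draw. The dimension, seed and number of draws are arbitrary choices for illustration.</p>
<pre><code>set.seed(1)       # arbitrary seed
k &lt;- 2            # dimension of the matrix
v &lt;- k + 1        # degrees of freedom; k + 1 should give uniform marginal correlations
n_draws &lt;- 10000

# If W ~ Wishart(v, I), then solve(W) ~ inverse-Wishart(v, I).
W &lt;- rWishart(n_draws, df = v, Sigma = diag(k))  # k x k x n_draws array

# Turn each inverse-Wishart draw into a correlation matrix and keep element (1, 2).
rho &lt;- apply(W, 3, function(w) cov2cor(solve(w))[1, 2])

hist(rho, breaks = 20, main = "Correlations implied by IW(k + 1, I)")
</code></pre>
<p>With these settings the histogram should look roughly flat between -1 and 1.</p>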
<p>An alternative strategy based on the separation strategy comes from a <a href="http://simpsonm.public.iastate.edu/ScaledInverseWishart/omalley.pdf" title="http://simpsonm.public.iastate.edu/ScaledInverseWishart/omalley.pdf" target="_blank">paper</a> (pdf) by O'Malley and Zaslavsky and <a href="http://andrewgelman.com/2009/08/constructing_a/" title="http://andrewgelman.com/2009/08/constructing_a/">endorsed</a> by Gelman - instead of modeling <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_f699fbc68f6c08e9aa52abdb067dffad.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="{\mathbf \Omega}" /></span><script type='math/tex'>{\mathbf \Omega}</script> as a correlation matrix, only constrain it to be positive semi-definite so that <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_48e36e465de0708e276544b2d89f648c.gif' style='vertical-align: middle; border: none; ' class='tex' alt="{\mathbf \Delta}" /></span><script type='math/tex'>{\mathbf \Delta}</script> and <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_f699fbc68f6c08e9aa52abdb067dffad.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="{\mathbf \Omega}" /></span><script type='math/tex'>{\mathbf \Omega}</script> jointly determine the standard deviations, but <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_f699fbc68f6c08e9aa52abdb067dffad.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="{\mathbf \Omega}" /></span><script type='math/tex'>{\mathbf \Omega}</script> still determines the correlations alone. 
This strategy uses the same lognormal prior on <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_3eb383281927193b89b53dc378e8a041.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\Delta}" /></span><script type='math/tex'>\mathbf{\Delta}</script> but uses the inverse-Wishart distribution <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_2d9b4813f39550c0cf3c201676c01c3b.gif' style='vertical-align: middle; border: none; ' class='tex' alt="IW(k+1, \mathbf{I})" /></span><script type='math/tex'>IW(k+1, \mathbf{I})</script> on <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_80c34080ca237effb36afd8356a378a8.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="\mathbf{\Omega}" /></span><script type='math/tex'>\mathbf{\Omega}</script> which still induces marginally uniform correlations on the resulting covariance matrix. This strategy is attractive since the inverse-Wishart distribution is already in a number of statistical packages and typically results in a conditionally conjugate model for the covariance matrix, allowing for a relatively simple analysis. If <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_80c34080ca237effb36afd8356a378a8.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="\mathbf{\Omega}" /></span><script type='math/tex'>\mathbf{\Omega}</script> is constrained to be a correlation matrix as in Barnard et al., sampling from the conditional posterior requires significantly more work. 
So the scaled inverse-Wishart is much easier to work with, but theoretically it still allows for some dependence between the correlations and the variances in <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_f8b95df9c23c544eebaba4d0b2868cae.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="\mathbf{\Sigma}" /></span><script type='math/tex'>\mathbf{\Sigma}</script>.</p>
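<p>A single draw from the scaled inverse-Wishart can be sketched in a few lines of base R. The N(0, 1) prior on the log scale factors and the seed are assumptions picked for illustration. The two checks at the end show which part of the decomposition controls what.</p>
<pre><code>set.seed(2)   # arbitrary seed
k &lt;- 2

# Omega ~ IW(k + 1, I): positive definite, but not constrained to be a correlation matrix
Omega &lt;- solve(rWishart(1, df = k + 1, Sigma = diag(k))[, , 1])

# log(delta_j) ~ N(0, 1): lognormal scale factors (assumed hyperparameters)
delta &lt;- exp(rnorm(k, mean = 0, sd = 1))
Delta &lt;- diag(delta)

Sigma &lt;- Delta %*% Omega %*% Delta

# The correlations of Sigma come from Omega alone ...
same_correlations &lt;- all.equal(cov2cor(Sigma), cov2cor(Omega))
# ... while the standard deviations depend on both delta and the diagonal of Omega.
same_sds &lt;- all.equal(sqrt(diag(Sigma)), delta * sqrt(diag(Omega)))

print(c(same_correlations, same_sds))
</code></pre>
<p>Both checks should come back TRUE up to numerical tolerance, since rescaling rows and columns by a positive diagonal matrix leaves the correlations unchanged.</p>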
<p>I've used the scaled inverse-Wishart before, so I was curious about how much of a difference it makes to use Barnard et al.'s prior. The impact on inference of using different priors will vary from model to model, but we can at least see how different priors encode the same prior information. Since <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_8fa14cdd754f91cc6554c9e71929cce7.gif' style='vertical-align: middle; border: none; ' class='tex' alt="f" /></span><script type='math/tex'>f</script> is just the density of the correlation matrix resulting from an inverse-Wishart distribution on the covariance matrix, it's relatively easy to sample from <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_8fa14cdd754f91cc6554c9e71929cce7.gif' style='vertical-align: middle; border: none; ' class='tex' alt="f" /></span><script type='math/tex'>f</script> - just sample from an inverse-Wishart and transform to correlations. So I specified Barnard et al.'s separation strategy prior (SS) and the scaled inverse-Wishart prior (sIW) in a similar way and sampled from them both, and then compared them to a sample from a standard inverse-Wishart (IW). In the first test, I assumed that the covariance matrix was <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_b6afe64110f5c78ec57c6cc87f09efb4.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="2\times2" /></span><script type='math/tex'>2\times2</script> and that <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_49916ea1f36756411f005d5d260b7026.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathrm{log}(\delta_j)\stackrel{iid}{\sim}N(0,1)" /></span><script type='math/tex'>\mathrm{log}(\delta_j)\stackrel{iid}{\sim}N(0,1)</script> for both the SS and sIW priors. 
The only difference was in <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_80c34080ca237effb36afd8356a378a8.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="\mathbf{\Omega}" /></span><script type='math/tex'>\mathbf{\Omega}</script>: for the SS prior I assumed that <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_37da627738077f75f3c0ba9a2b045ba2.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\Omega}\sim f(\mathbf{\Omega}|v=3)" /></span><script type='math/tex'>\mathbf{\Omega}\sim f(\mathbf{\Omega}|v=3)</script> and for the sIW prior, I assumed <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_b6adba2648980dbac62e97070dbdbf66.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\Omega}\sim IW(3, \mathbf{I})" /></span><script type='math/tex'>\mathbf{\Omega}\sim IW(3, \mathbf{I})</script>. In other words, for both priors I assumed that the correlations were marginally uniform. For the IW prior, I assumed that <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_2dd449d53fdb6614886d8f96bb855c28.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\Sigma}\sim IW(3, \mathbf{I})" /></span><script type='math/tex'>\mathbf{\Sigma}\sim IW(3, \mathbf{I})</script> to once again ensure marginally uniform correlations as a point of comparison. Taking 10,000 samples from each prior, we can see this behavior in the following plot of the correlations:</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ScaledInverseWishart/corr1.png" alt="Histograms of correlation by prior" /></p>
<p>As expected, the correlations look uniform under all three priors. With either the SS or the sIW prior, it's possible to change this so that the prior favors either high correlation (closer to <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_c4ca4238a0b923820dcc509a6f75849b.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="1" /></span><script type='math/tex'>1</script> or <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_6bb61e3b7bce0931da574d19d1d82c88.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="-1" /></span><script type='math/tex'>-1</script>) or low correlation (closer to <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_cfcd208495d565ef66e7dff9f98764da.gif' style='vertical-align: middle; border: none; padding-bottom:1px;' class='tex' alt="0" /></span><script type='math/tex'>0</script>) through the manipulation of <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_9e3669d19b675bd57058fd4664205d2a.gif' style='vertical-align: middle; border: none; padding-bottom:2px;' class='tex' alt="v" /></span><script type='math/tex'>v</script> - see Barnard et al. or O'Malley &#038; Zaslavsky for details. Next, we take a look at the log standard deviations:</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ScaledInverseWishart/logSD1.png" alt="Histograms of log standard deviations by prior and variance component" /></p>
<p>The histograms of SS and sIW look pretty similar, the only difference being that sIW has slightly fatter tails and/or a higher variance. In other words, given the same parameter choices sIW yields a slightly less informative prior. This isn't a problem - once we understand the behavior of sIW vs. SS, we can set the priors on <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_3eb383281927193b89b53dc378e8a041.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathbf{\Delta}" /></span><script type='math/tex'>\mathbf{\Delta}</script> differently to encode the same prior information on the standard deviations. The priors as specified don't appear to be satisfactorily noninformative, if that is the prior information we're trying to capture, but that can easily be fixed by increasing the variance of the prior. The IW prior, on the other hand, has a very narrow range of values for the standard deviation that can only be changed by also changing the prior on the correlations - one of its main drawbacks. Finally, we look at the dependence between the first variance component and the correlation coefficient in the next plot:</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ScaledInverseWishart/corrVvar1.png" alt="Scatterplot of correlation coefficient vs. first variance component by prior" /></p>
<p>As expected, there's no dependence between the correlation and the variance in the SS prior - the prior assumes they are independent. The sIW prior, on the other hand, exhibits the same disturbing dependence as the IW prior, also documented in Simon Barthelmé's <a href="http://dahtah.wordpress.com/2012/03/07/why-an-inverse-wishart-prior-may-not-be-such-a-good-idea/" title="http://dahtah.wordpress.com/2012/03/07/why-an-inverse-wishart-prior-may-not-be-such-a-good-idea/" target="_blank">post</a> at dahtah. High variances are associated with more extreme correlations in both the sIW and IW priors - the sIW prior doesn't seem to improve on the IW prior at all in this respect.</p>
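<p>One way to see where this comes from, as a sketch rather than a proof: assume the sIW prior is built the way the code at the end of this post builds it, with <script type='math/tex'>\mathbf{\Sigma}=\mathbf{\Delta}\mathbf{\Omega}\mathbf{\Delta}</script>, <script type='math/tex'>\mathbf{\Delta}=\mathrm{diag}(\delta_1,\delta_2)</script> and <script type='math/tex'>\mathbf{\Omega}\sim IW(v, \mathbf{I})</script>. The element notation <script type='math/tex'>\omega_{ij}</script> for <script type='math/tex'>\mathbf{\Omega}</script>, <script type='math/tex'>\sigma_j</script> for the standard deviations and <script type='math/tex'>\rho</script> for the correlation in <script type='math/tex'>\mathbf{\Sigma}</script> is mine. Then</p>
<p><script type='math/tex; mode=display'>\sigma_j^2=\delta_j^2\,\omega_{jj},\qquad \rho=\frac{\delta_1\delta_2\,\omega_{12}}{\sqrt{\delta_1^2\,\omega_{11}\,\delta_2^2\,\omega_{22}}}=\frac{\omega_{12}}{\sqrt{\omega_{11}\,\omega_{22}}}</script></p>
<p><script type='math/tex; mode=display'>\mathrm{log}(\sigma_j)=\mathrm{log}(\delta_j)+\tfrac{1}{2}\mathrm{log}(\omega_{jj})</script></p>
<p>The <script type='math/tex'>\delta_j</script> cancel out of <script type='math/tex'>\rho</script>, so the correlation is exactly the correlation of the inverse-Wishart draw, while each <script type='math/tex'>\sigma_j</script> still carries a factor of <script type='math/tex'>\omega_{jj}</script>. Whatever dependence the inverse-Wishart puts between <script type='math/tex'>\omega_{jj}</script> and <script type='math/tex'>\rho</script> passes straight through to <script type='math/tex'>\sigma_j</script> and <script type='math/tex'>\rho</script>, and the independent <script type='math/tex'>\delta_j</script> only adds noise on top of it. In the <script type='math/tex'>2\times2</script> case the link is explicit: writing <script type='math/tex'>w_{11}</script> for the first diagonal element of <script type='math/tex'>\mathbf{\Omega}^{-1}</script>, inverting the matrix gives</p>
<p><script type='math/tex; mode=display'>\omega_{11}=\frac{1}{w_{11}\,(1-\rho^2)}</script></p>
<p>so for any fixed <script type='math/tex'>w_{11}</script> the variance blows up as <script type='math/tex'>|\rho|</script> approaches <script type='math/tex'>1</script>. The same decomposition accounts for the fatter tails in the histograms above: since <script type='math/tex'>\delta_j</script> and <script type='math/tex'>\mathbf{\Omega}</script> are independent, <script type='math/tex'>\mathrm{Var}(\mathrm{log}\,\sigma_j)=\mathrm{Var}(\mathrm{log}\,\delta_j)+\tfrac{1}{4}\mathrm{Var}(\mathrm{log}\,\omega_{jj})</script>, which is larger than the <script type='math/tex'>\mathrm{Var}(\mathrm{log}\,\delta_j)</script> that the SS prior gives with the same parameters, because there <script type='math/tex'>\mathbf{\Omega}</script> is a correlation matrix and <script type='math/tex'>\sigma_j=\delta_j</script>.</p>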
<p>I changed the prior on <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_4c852388bf40919e11d9a9237b93dfe5.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathrm{log}(\mathbf{\delta})" /></span><script type='math/tex'>\mathrm{log}(\mathbf{\delta})</script> to have mean <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_1512e88ad9e3889a67a37bccbe9190ce.gif' style='vertical-align: middle; border: none; ' class='tex' alt="(10, 0)" /></span><script type='math/tex'>(10, 0)</script> and covariance matrix <span class='MathJax_Preview'><img src='http://www.themattsimpson.com/wp-content/plugins/latex/cache/tex_759bab53521dc6042abae9a2ff383f63.gif' style='vertical-align: middle; border: none; ' class='tex' alt="\mathrm{diag}(.2, .2)" /></span><script type='math/tex'>\mathrm{diag}(.2, .2)</script> reflecting a situation where we have a fairly strong prior belief that the two standard deviations are different from each other. Without commentary, the plots tell essentially the same story:</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ScaledInverseWishart/corr2.png" alt="Histograms of correlation by prior" /></p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ScaledInverseWishart/logSD2.png" alt="Histograms of log standard deviations by prior and variance component" /></p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/ScaledInverseWishart/corrVvar2.png" alt="Scatterplot of correlation coefficient vs. first variance component by prior" /></p>
<p>It doesn't look great for the scaled inverse-Wishart. This prior is capable of encoding a wider range of prior information than the standard inverse-Wishart by allowing the modeler to separately model the correlations and the variances. However, the modeler isn't able to control the prior dependence between the variances and the correlations - the level of dependence appears to be the same as in the standard inverse-Wishart prior. I suspect that for most researchers the problems this causes aren't so bad when weighed against the computational issues that arise when trying to simulate from a correlation matrix, but it's hard to tell without actually fitting models in order to determine the effect on shrinkage.</p>
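<p>For a rough number to go with the scatterplots, here is a minimal sketch that measures the prior dependence between the first variance component and the size of the correlation under sIW and SS. The function name dep.check is hypothetical and not part of the code for this post, the choice of Spearman's rank correlation is mine, and I've swapped in base R's rWishart (inverting each draw) for MCMCpack's riwish so that the snippet runs without any packages.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;">## Sketch only: dep.check is a hypothetical helper, not part of the post's code.
## It uses base R's rWishart in place of MCMCpack's riwish.
dep.check &lt;- function(n = 10000, df = 3){
    ## if W ~ Wishart(df, I) then solve(W) ~ inverse-Wishart(df, I)
    W &lt;- rWishart(n, df, diag(2))
    ## 4 x n matrix, one vectorised 2x2 inverse-Wishart draw per column
    Omega &lt;- apply(W, 3, solve)
    omega11 &lt;- Omega[1, ]
    rho &lt;- Omega[2, ] / sqrt(Omega[1, ] * Omega[4, ])
    ## log(delta_1) ~ N(0, 1), independent of Omega
    delta1 &lt;- exp(rnorm(n))
    ## first variance component under each prior
    var.sIW &lt;- delta1^2 * omega11
    var.SS &lt;- delta1^2
    ## rank correlation between the variance and the size of the correlation
    c(sIW = cor(var.sIW, abs(rho), method = "spearman"),
      SS = cor(var.SS, abs(rho), method = "spearman"))
}
set.seed(1)
dep.check()</pre></td></tr></table></div>

<p>I'd expect the sIW entry to come out clearly positive and the SS entry to sit near zero, in line with the scatterplots. A rank correlation seemed like the safer summary here because the variances are extremely heavy tailed, but it's only one of several reasonable ways to put a number on the dependence.</p>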
<p>Finally, here's some R code to generate the plots in this post:</p>

<div class="wp_codebox"><table><tr id="p1132"><td class="code" id="p113code2"><pre class="rsplus" style="font-family:monospace;"><span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>MCMCpack<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span>ggplot2<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">require</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">reshape</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;">## simulates a single sample from the sIW prior</span>
sIW.<span style="">sim</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>k, m<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,k<span style="color: #080;">&#41;</span>, s<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,k<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">df</span><span style="color: #080;">=</span>k<span style="color: #080;">+</span><span style="color: #ff0000;">1</span>, M<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span>k<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
    R <span style="color: #080;">&lt;-</span> riwish<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
    S <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">exp</span><span style="color: #080;">&#40;</span>mvrnorm<span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,m, <span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span>s<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
    SRS <span style="color: #080;">&lt;-</span> S<span style="color: #080;">%*%</span>R<span style="color: #080;">%*%</span>S
    <span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span>SRS<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
<span style="color: #228B22;">## simulates n samples from the sIW prior</span>
sIW.<span style="">test</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>n, k, m<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,k<span style="color: #080;">&#41;</span>, s<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,k<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">df</span><span style="color: #080;">=</span>k<span style="color: #080;">+</span><span style="color: #ff0000;">1</span>, M<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span>k<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
    sig1 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,n<span style="color: #080;">&#41;</span>
    sig2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,n<span style="color: #080;">&#41;</span>
    rho <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,n<span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span>n<span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
        E <span style="color: #080;">&lt;-</span> sIW.<span style="">sim</span><span style="color: #080;">&#40;</span>k, m, s, <span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
        sds <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sqrt</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span>E<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
        rho<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cov2cor</span><span style="color: #080;">&#40;</span>E<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
        sig1<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> sds<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
        sig2<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> sds<span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>
    <span style="color: #080;">&#125;</span>
    <span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>value<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>sig1, sig2, rho<span style="color: #080;">&#41;</span>, parameter<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;sigma1&quot;</span>,n<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;sigma2&quot;</span>,n<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;rho&quot;</span>,n<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>, sam<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span>n<span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
<span style="color: #228B22;">## simulates n samples from the SS prior</span>
SS.<span style="">test</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>n, k, m<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,k<span style="color: #080;">&#41;</span>, s<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,k<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">df</span><span style="color: #080;">=</span>k<span style="color: #080;">+</span><span style="color: #ff0000;">1</span>, M<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span>k<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
    sig1 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,n<span style="color: #080;">&#41;</span>
    sig2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,n<span style="color: #080;">&#41;</span>
    rho <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,n<span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span>n<span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
        E <span style="color: #080;">&lt;-</span> riwish<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
        rho<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cov2cor</span><span style="color: #080;">&#40;</span>E<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
        sds <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">exp</span><span style="color: #080;">&#40;</span>mvrnorm<span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,m, <span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span>s<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
        sig1<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> sds<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
        sig2<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> sds<span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span>
    <span style="color: #080;">&#125;</span>
    <span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>value<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>sig1, sig2, rho<span style="color: #080;">&#41;</span>, parameter<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;sigma1&quot;</span>,n<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;sigma2&quot;</span>,n<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;rho&quot;</span>,n<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>, sam<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span>n<span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
&nbsp;
<span style="color: #228B22;">## simulates n samples from the IW prior</span>
IW.<span style="">test</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>n, k, <span style="color: #0000FF; font-weight: bold;">df</span><span style="color: #080;">=</span>k<span style="color: #080;">+</span><span style="color: #ff0000;">1</span>, M<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span>k<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
    sig1 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,n<span style="color: #080;">&#41;</span>
    sig2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,n<span style="color: #080;">&#41;</span>
    rho <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,n<span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">for</span><span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span>n<span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
        E <span style="color: #080;">&lt;-</span> riwish<span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
        rho<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">cov2cor</span><span style="color: #080;">&#40;</span>E<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
        vars <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span>E<span style="color: #080;">&#41;</span>
        sig1<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sqrt</span><span style="color: #080;">&#40;</span>vars<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
        sig2<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">sqrt</span><span style="color: #080;">&#40;</span>vars<span style="color: #080;">&#91;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
    <span style="color: #080;">&#125;</span>
    <span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span><span style="color: #080;">&#40;</span>value<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>sig1, sig2, rho<span style="color: #080;">&#41;</span>, parameter<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;sigma1&quot;</span>,n<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;sigma2&quot;</span>,n<span style="color: #080;">&#41;</span>, <span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;rho&quot;</span>,n<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>, sam<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">rep</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span>n<span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #080;">&#125;</span>
&nbsp;
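<span style="color: #228B22;">## first test: log(delta) has mean (0, 0) and covariance diag(1, 1)</span>
n <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">10000</span>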
k <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">2</span>
m <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,<span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span>
s <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">df</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">3</span>
M <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span>
&nbsp;
&nbsp;
sIWsam.1 <span style="color: #080;">&lt;-</span> sIW.<span style="">test</span><span style="color: #080;">&#40;</span>n, k, m, s, <span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
SSsam.1 <span style="color: #080;">&lt;-</span> SS.<span style="">test</span><span style="color: #080;">&#40;</span>n, k, m, s, <span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
IWsam.1 <span style="color: #080;">&lt;-</span> IW.<span style="">test</span><span style="color: #080;">&#40;</span>n, k, <span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
&nbsp;
sIWsam.1$dens <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;sIW&quot;</span>
SSsam.1$dens <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;SS&quot;</span>
IWsam.1$dens <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;IW&quot;</span>
&nbsp;
data.1 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rbind</span><span style="color: #080;">&#40;</span>sIWsam.1, SSsam.1, IWsam.1<span style="color: #080;">&#41;</span>
&nbsp;
qplot<span style="color: #080;">&#40;</span>value, <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span>data.1<span style="color: #080;">&#91;</span>data.1$parameter<span style="color: #080;">!=</span><span style="color: #ff0000;">&quot;rho&quot;</span>,<span style="color: #080;">&#93;</span>, <span style="color: #0000FF; font-weight: bold;">log</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;x&quot;</span>, facets<span style="color: #080;">=</span>dens~parameter, xlab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Log Standard Deviation&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
qplot<span style="color: #080;">&#40;</span>value, <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span>data.1<span style="color: #080;">&#91;</span>data.1$parameter<span style="color: #080;">==</span><span style="color: #ff0000;">&quot;rho&quot;</span>,<span style="color: #080;">&#93;</span>, facets<span style="color: #080;">=</span>dens~., xlab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Correlation&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
data.1.<span style="">melt</span> <span style="color: #080;">&lt;-</span> melt<span style="color: #080;">&#40;</span>data.1, id<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;parameter&quot;</span>, <span style="color: #ff0000;">&quot;dens&quot;</span>, <span style="color: #ff0000;">&quot;sam&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
data.1.<span style="">cast</span> <span style="color: #080;">&lt;-</span> cast<span style="color: #080;">&#40;</span>data.1.<span style="">melt</span>, dens<span style="color: #080;">+</span>sam~parameter<span style="color: #080;">&#41;</span>
&nbsp;
qplot<span style="color: #080;">&#40;</span>sigma1<span style="color: #080;">^</span><span style="color: #ff0000;">2</span>, rho, <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span>data.1.<span style="">cast</span>, facets<span style="color: #080;">=</span>.~dens, xlab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;First variance component&quot;</span>, ylab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Correlation&quot;</span>, <span style="color: #0000FF; font-weight: bold;">log</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;x&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
n <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">10000</span>
k <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">2</span>
m <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">10</span>,<span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span>
s <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span>.2,.2<span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">df</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">3</span>
M <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">diag</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">2</span><span style="color: #080;">&#41;</span>
&nbsp;
sIWsam.2 <span style="color: #080;">&lt;-</span> sIW.<span style="">test</span><span style="color: #080;">&#40;</span>n, k, m, s, <span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
SSsam.2 <span style="color: #080;">&lt;-</span> SS.<span style="">test</span><span style="color: #080;">&#40;</span>n, k, m, s, <span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
IWsam.2 <span style="color: #080;">&lt;-</span> IW.<span style="">test</span><span style="color: #080;">&#40;</span>n, k, <span style="color: #0000FF; font-weight: bold;">df</span>, M<span style="color: #080;">&#41;</span>
&nbsp;
sIWsam.2$dens <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;sIW&quot;</span>
SSsam.2$dens <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;SS&quot;</span>
IWsam.2$dens <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;IW&quot;</span>
&nbsp;
&nbsp;
data.2 <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">rbind</span><span style="color: #080;">&#40;</span>sIWsam.2, SSsam.2, IWsam.2<span style="color: #080;">&#41;</span>
&nbsp;
qplot<span style="color: #080;">&#40;</span>value, <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span>data.2<span style="color: #080;">&#91;</span>data.2$parameter<span style="color: #080;">!=</span><span style="color: #ff0000;">&quot;rho&quot;</span>,<span style="color: #080;">&#93;</span>, <span style="color: #0000FF; font-weight: bold;">log</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;x&quot;</span>, facets<span style="color: #080;">=</span>dens~parameter, xlab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Log Standard Deviation&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
qplot<span style="color: #080;">&#40;</span>value, <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span>data.2<span style="color: #080;">&#91;</span>data.2$parameter<span style="color: #080;">==</span><span style="color: #ff0000;">&quot;rho&quot;</span>,<span style="color: #080;">&#93;</span>, facets<span style="color: #080;">=</span>dens~., xlab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Correlation&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
data.2.<span style="">melt</span> <span style="color: #080;">&lt;-</span> melt<span style="color: #080;">&#40;</span>data.2, id<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;parameter&quot;</span>, <span style="color: #ff0000;">&quot;dens&quot;</span>, <span style="color: #ff0000;">&quot;sam&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
data.2.<span style="">cast</span> <span style="color: #080;">&lt;-</span> cast<span style="color: #080;">&#40;</span>data.2.<span style="">melt</span>, dens<span style="color: #080;">+</span>sam~parameter<span style="color: #080;">&#41;</span>
&nbsp;
qplot<span style="color: #080;">&#40;</span>sigma1<span style="color: #080;">^</span><span style="color: #ff0000;">2</span>, rho, <span style="color: #0000FF; font-weight: bold;">data</span><span style="color: #080;">=</span>data.2.<span style="">cast</span>, facets<span style="color: #080;">=</span>.~dens, xlab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;First variance component&quot;</span>, ylab<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Correlation&quot;</span>, <span style="color: #0000FF; font-weight: bold;">log</span><span style="color: #080;">=</span><span style="color: #ff0000;">&quot;x&quot;</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

]]></content:encoded>
			<wfw:commentRss>http://www.themattsimpson.com/2012/08/20/prior-distributions-for-covariance-matrices-the-scaled-inverse-wishart-prior/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Setting up R for CFAR minicamp</title>
		<link>http://www.themattsimpson.com/2012/07/19/setting-up-r-for-cfar-minicamp/</link>
		<comments>http://www.themattsimpson.com/2012/07/19/setting-up-r-for-cfar-minicamp/#comments</comments>
		<pubDate>Thu, 19 Jul 2012 22:51:54 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[CFAR]]></category>
		<category><![CDATA[minicamp]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[Rstudio]]></category>

		<guid isPermaLink="false">http://www.themattsimpson.com/?p=93</guid>
		<description><![CDATA[I'm giving a short "unconference" on doing basic statistics with R at a Center for Applied Rationality minicamp I'm attending next week, and I'll need participants to show up with R installed on their laptops along with RStudio. This &#8230; <a class="more-link" href="http://www.themattsimpson.com/2012/07/19/setting-up-r-for-cfar-minicamp/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>I'm giving a short "unconference" on doing basic statistics with R at a <a href="http://appliedrationality.org/" target="_blank">Center for Applied Rationality</a> minicamp I'm attending next week, and I'll need participants to show up with R installed on their laptops along with <a href="http://rstudio.org/" target="_blank">RStudio</a>. This is a short tutorial for participants to follow in order to get everything up and running.</p>
<p>First, go to R's <a href="http://www.r-project.org/" target="_blank">website</a>, which should look something like this:</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/MinicampPresentation/Rcran.png" alt="" /></p>
<p>On the left hand side, click on the CRAN link under "Download, Packages" next to the arrow in the image above. This will bring up a page of links to download R from, like the image below. Choose a location close to you so that the download is faster.</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/MinicampPresentation/Rcran2.png" alt="" /></p>
<p>This should bring you to a page that looks like the image below. Now click on the link to download R for whatever operating system you're running. For example, if your computer runs Windows, click on "Download R for Windows."</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/MinicampPresentation/Rcran3.png" alt="" /></p>
<p>This should bring up another page with a few more links, like the picture below. Click on "base."</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/MinicampPresentation/Rcran4.png" alt="" /></p>
<p>Finally we get to the page with the actual download link, like below. Click on "Download R 2.15.1 for X" where X is whatever operating system you have.</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/MinicampPresentation/Rcran5.png" alt="" /></p>
<p>This will download the executable to install R - a file named "R-2.15.1-win.exe" or something similar. Make sure you save it somewhere you can find it easily. When it finishes, run the file. Just follow the on-screen instructions that pop up. You shouldn't have to change anything in order for R to properly install.</p>
<p>Now you're all set to start using R... except the GUI that R comes with out of the box isn't very good. <a href="http://rstudio.org/" target="_blank">RStudio</a> is a free IDE that improves on the base R GUI substantially. Go <a href="http://rstudio.org/download/" target="_blank">here</a> to download it. Choose the version of RStudio that the website recommends for your machine and save it somewhere you can easily find it. Once the download completes, open the file - it should be called "RStudio-0.96.316.exe" or something similar. From this point, just follow the on-screen instructions to complete the installation.</p>
<p>That's all the software you'll need for my presentation! If you have some time it might be useful to poke around RStudio to get a general feel for it, but you certainly don't need to in order to understand my presentation. There are a few tutorials scattered about the web to guide your poking, including this <a href="http://stat.ethz.ch/consulting/Statistiksoftware/R/RStudio.pdf" target="_blank">one</a> (warning: pdf).</p>
<p>EDIT: Here are a few more resources that will be useful. First, you'll need an additional R package called arm - this package allows you to quickly fit Bayesian linear and generalized linear models. In order to install this, open up RStudio and make sure you're connected to the internet. Then type install.packages("arm") into the R console, as below.</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/MinicampPresentation/Rarm1.png" alt="" /></p>
<p>R will output the following message, or something similar:</p>
<blockquote><p>Installing package(s) into ‘C:/Users/Matt/R/win-library/2.15’<br />
(as ‘lib’ is unspecified)<br />
--- Please select a CRAN mirror for use in this session ---</p></blockquote>
<p>Wait for a few seconds, then R will give you some options for which mirror to download from. Type the number for the mirror that is closest to you to download everything faster, then press enter. See the picture below.</p>
<p><img src="http://simpsonm.public.iastate.edu/BlogPosts/MinicampPresentation/Rarm2.png" alt="" /></p>
<p>R should be pretty busy for a few minutes while it downloads and installs several packages that arm depends on as well as arm itself. Once it finishes and you see the blue ">" at the bottom of the R console, the packages are installed.</p>
<p>Furthermore, you'll need another R package called "ggplot2". You can install this using the same command: install.packages("ggplot2").</p>
<p>In addition, I've uploaded a dataset that we'll be using during my presentation. Download it <a href="http://simpsonm.public.iastate.edu/BlogPosts/MinicampPresentation/diam.csv" target="_blank">here</a>. Save it as diam.csv somewhere you can easily find it.</p>
<p>So to be 100% ready for my presentation, you need to have installed R, RStudio, and the arm and ggplot2 packages, as well as have downloaded the data file diam.csv.</p>
<p>Finally, here are a few useful resources for before, during, and after my presentation:<br />
<a href="http://simpsonm.public.iastate.edu/BlogPosts/MinicampPresentation/presentation.r" target="_blank">An R script containing my entire presentation</a>. Open with any text editor, though opening it in RStudio is best.<br />
<a href="http://cran.r-project.org/doc/contrib/Short-refcard.pdf" target="_blank">An R reference card</a>. (pdf) Print it out and tape it on the wall next to your desk. Useful for after the presentation.<br />
<a href="http://had.co.nz/ggplot2/" target="_blank">The ggplot2 website</a>. This contains useful information for making complicated and informative graphics. Useful after the presentation.<br />
<a href="http://www.public.iastate.edu/~hofmann/stat579/" target="_blank">Course website for the class I took to learn R</a>. Some overlap, but there are many new things. Useful for after the presentation.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.themattsimpson.com/2012/07/19/setting-up-r-for-cfar-minicamp/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Writing CUDA C extensions for R</title>
		<link>http://www.themattsimpson.com/2012/06/29/writing-cuda-c-extensions-for-r/</link>
		<comments>http://www.themattsimpson.com/2012/06/29/writing-cuda-c-extensions-for-r/#comments</comments>
		<pubDate>Fri, 29 Jun 2012 03:22:57 +0000</pubDate>
		<dc:creator>admin</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[cuda]]></category>
		<category><![CDATA[gpu]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://www.themattsimpson.com/?p=74</guid>
		<description><![CDATA[Here's a quick guide to writing CUDA C extensions for R to use a GPU for computation. I found another guide elsewhere, but it's a bit roundabout. This guide uses a more straightforward method that might also be a little &#8230; <a class="more-link" href="http://www.themattsimpson.com/2012/06/29/writing-cuda-c-extensions-for-r/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
				<content:encoded><![CDATA[<p>Here's a quick guide to writing CUDA C extensions for R to use a GPU for computation. I found another guide <a href="http://possiblybrainful.blogspot.com/2011/08/using-gpu-in-r-scripts.html" target="_blank">elsewhere</a>, but it's a bit roundabout. This guide uses a more direct method, though it might also be a little more complicated.</p>
<p>First, we'll look at writing a C extension and see how that generalizes. Let's say we're writing a simple program that uses C to add two numbers. If we were just writing a C program, it would look something like this:</p>

<div class="wp_codebox"><table><tr id="p7414"><td class="code" id="p74code14"><pre class="c" style="font-family:monospace;"><span style="color: #339933;">#include &lt;stdio.h&gt;</span>
<span style="color: #339933;">#include &lt;stdlib.h&gt;</span>
&nbsp;
<span style="color: #993333;">void</span> add<span style="color: #009900;">&#40;</span><span style="color: #993333;">double</span> <span style="color: #339933;">*</span>a<span style="color: #339933;">,</span> <span style="color: #993333;">double</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">,</span> <span style="color: #993333;">double</span> <span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
  <span style="color: #339933;">*</span>c <span style="color: #339933;">=</span> <span style="color: #339933;">*</span>a <span style="color: #339933;">+</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">;</span>
  <span style="color: #000066;">printf</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;%.0f + %.0f = %.0f<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> <span style="color: #339933;">*</span>a<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #993333;">int</span> main<span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
  <span style="color: #993333;">double</span> <span style="color: #339933;">*</span>a<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>c<span style="color: #339933;">;</span>
  size_t fbytes <span style="color: #339933;">=</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">double</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  a <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">double</span> <span style="color: #339933;">*</span><span style="color: #009900;">&#41;</span> malloc<span style="color: #009900;">&#40;</span>fbytes<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  b <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">double</span> <span style="color: #339933;">*</span><span style="color: #009900;">&#41;</span> malloc<span style="color: #009900;">&#40;</span>fbytes<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  c <span style="color: #339933;">=</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">double</span> <span style="color: #339933;">*</span><span style="color: #009900;">&#41;</span> malloc<span style="color: #009900;">&#40;</span>fbytes<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #666666; font-style: italic;">//code for user input omitted</span>
&nbsp;
  add<span style="color: #009900;">&#40;</span>a<span style="color: #339933;">,</span> b<span style="color: #339933;">,</span> c<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #000066;">printf</span><span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>c = %.4f<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> <span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #b1b100;">return</span> <span style="color: #0000dd;">0</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>None of this should be new to anyone who's programmed in C before. Now suppose we want to be able to call the function <code>add</code> from R. We need to change the function slightly to accommodate this:</p>

<div class="wp_codebox"><table><tr id="p7415"><td class="code" id="p74code15"><pre class="c" style="font-family:monospace;"><span style="color: #339933;">#include &lt;stdio.h&gt;</span>
<span style="color: #339933;">#include &lt;stdlib.h&gt;</span>
<span style="color: #339933;">#include &lt;R.h&gt; </span>
<span style="color: #666666; font-style: italic;">//needed to interface with R</span>
&nbsp;
<span style="color: #993333;">void</span> add<span style="color: #009900;">&#40;</span><span style="color: #993333;">double</span> <span style="color: #339933;">*</span>a<span style="color: #339933;">,</span> <span style="color: #993333;">double</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">,</span> <span style="color: #993333;">double</span> <span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
  <span style="color: #339933;">*</span>c <span style="color: #339933;">=</span> <span style="color: #339933;">*</span>a <span style="color: #339933;">+</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">;</span>
&nbsp;
  <span style="color: #666666; font-style: italic;">//similar to printf, except prints to R console</span>
  Rprintf<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;%.0f + %.0f = %.0f<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> <span style="color: #339933;">*</span>a<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span> 
&nbsp;
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>There are a couple of twists here. First, we need to include the header <code>R.h</code>. This declares the functions that C code uses to interact with R, including <code>Rprintf</code>. <code>Rprintf</code> works almost exactly like <code>printf</code> except it prints directly to the R console. In addition to including the header, any function that we want to call from R has to satisfy two constraints: it must have return type <code>void</code> and take only pointer arguments. Then to compile the program so that it's callable from R, we would type into the terminal</p>

<div class="wp_codebox"><table><tr id="p7416"><td class="code" id="p74code16"><pre class="bash" style="font-family:monospace;">R CMD SHLIB foo.c</pre></td></tr></table></div>

<p>Here <code>foo.c</code> is the name of the file containing the code above. This command yields the following output in our terminal:</p>

<div class="wp_codebox"><table><tr id="p7417"><td class="code" id="p74code17"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">gcc</span> <span style="color: #660033;">-std</span>=gnu99 -I<span style="color: #000000; font-weight: bold;">/</span>apps<span style="color: #000000; font-weight: bold;">/</span>lib64<span style="color: #000000; font-weight: bold;">/</span>R<span style="color: #000000; font-weight: bold;">/</span>include  -I<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>include 
    <span style="color: #660033;">-fpic</span> <span style="color: #660033;">-g</span> <span style="color: #660033;">-O2</span> <span style="color: #660033;">-c</span> foo.c <span style="color: #660033;">-o</span> foo.o
<span style="color: #c20cb9; font-weight: bold;">gcc</span> <span style="color: #660033;">-std</span>=gnu99 <span style="color: #660033;">-shared</span> -L<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>lib64 <span style="color: #660033;">-o</span> foo.so foo.o</pre></td></tr></table></div>

<p>These are the commands you would have entered if you wanted to compile the code manually. We'll take a closer look at that later when we compile CUDA C code. For now though, just note that now in your working directory you have two new files: <code>foo.o</code> and <code>foo.so</code>. The latter is the file we'll use to call <code>add</code> from R. To do this, we'll need an R wrapper to call our C function transparently, e.g.:</p>

<div class="wp_codebox"><table><tr id="p7418"><td class="code" id="p74code18"><pre class="rsplus" style="font-family:monospace;">add <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>a,b<span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
  <span style="color: #228B22;">##check to see if function is already loaded</span>
  <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">loaded</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;add&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">dyn.<span style="">load</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;foo.so&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
  <span style="color: #0000FF; font-weight: bold;">c</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">0</span>
  z <span style="color: #080;">&lt;-</span> .<span style="">C</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;add&quot;</span>,a<span style="color: #080;">=</span>a,b<span style="color: #080;">=</span>b,<span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">c</span> <span style="color: #080;">&lt;-</span> z$c
  <span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span></pre></td></tr></table></div>

<p>There are a couple of elements here. First, the function <code>dyn.load()</code> loads the shared library file <code>foo.so</code>. <code>is.loaded()</code>, unsurprisingly, checks to see if its argument has already been loaded. Note that we load a file but check to see if a <em>specific function</em> has already been loaded. The next element is the <code>.C</code> function. There are other ways to call C functions from R, but <code>.C</code> is the easiest. The first argument of <code>.C</code> is the name of the C function you want to call in string form. The rest of the arguments are the arguments for the function you're calling. <code>.C</code> returns a list containing the values of all of the arguments after the C function has run. <code>.C</code> passes copies of the arguments to the C function, so it does NOT update the R variable <code>c</code> in place - we have to copy the result back from the list that <code>.C</code> returns. When we run this code, this is what we see:</p>

<div class="wp_codebox"><table><tr id="p7419"><td class="code" id="p74code19"><pre class="rsplus" style="font-family:monospace;"><span style="color: #080;">&gt;</span> <span style="color: #0000FF; font-weight: bold;">source</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;foo.r&quot;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&gt;</span> add<span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>
<span style="color: #ff0000;">1</span> <span style="color: #080;">+</span> <span style="color: #ff0000;">1</span> <span style="color: #080;">=</span> <span style="color: #ff0000;">2</span>
<span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span> <span style="color: #ff0000;">2</span></pre></td></tr></table></div>

<p>This is all simple enough, but what about when we want to call a function that runs on the GPU? Assuming that we are trying to run the same function, there are a couple of tweaks. Here's the source:</p>

<div class="wp_codebox"><table><tr id="p7420"><td class="code" id="p74code20"><pre class="c" style="font-family:monospace;"><span style="color: #339933;">#include &lt;stdio.h&gt;</span>
<span style="color: #339933;">#include &lt;stdlib.h&gt;</span>
<span style="color: #339933;">#include &lt;R.h&gt;</span>
&nbsp;
__global__ <span style="color: #993333;">void</span> add<span style="color: #009900;">&#40;</span><span style="color: #993333;">float</span> <span style="color: #339933;">*</span>a<span style="color: #339933;">,</span> <span style="color: #993333;">float</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">,</span> <span style="color: #993333;">float</span> <span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
  <span style="color: #339933;">*</span>c <span style="color: #339933;">=</span> <span style="color: #339933;">*</span>b <span style="color: #339933;">+</span> <span style="color: #339933;">*</span>a<span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">extern</span> <span style="color: #ff0000;">&quot;C&quot;</span> <span style="color: #993333;">void</span> gpuadd<span style="color: #009900;">&#40;</span><span style="color: #993333;">float</span> <span style="color: #339933;">*</span>a<span style="color: #339933;">,</span> <span style="color: #993333;">float</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">,</span> <span style="color: #993333;">float</span> <span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #009900;">&#123;</span>
  <span style="color: #993333;">float</span> <span style="color: #339933;">*</span>da<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>db<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>dc<span style="color: #339933;">;</span>
&nbsp;
  cudaMalloc<span style="color: #009900;">&#40;</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #339933;">**</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">&amp;</span>da<span style="color: #339933;">,</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">float</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  cudaMalloc<span style="color: #009900;">&#40;</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #339933;">**</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">&amp;</span>db<span style="color: #339933;">,</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">float</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  cudaMalloc<span style="color: #009900;">&#40;</span> <span style="color: #009900;">&#40;</span><span style="color: #993333;">void</span><span style="color: #339933;">**</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">&amp;</span>dc<span style="color: #339933;">,</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">float</span><span style="color: #009900;">&#41;</span> <span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  cudaMemcpy<span style="color: #009900;">&#40;</span> da<span style="color: #339933;">,</span> a<span style="color: #339933;">,</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">float</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> cudaMemcpyHostToDevice<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  cudaMemcpy<span style="color: #009900;">&#40;</span> db<span style="color: #339933;">,</span> b<span style="color: #339933;">,</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">float</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> cudaMemcpyHostToDevice<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  add<span style="color: #339933;">&lt;&lt;&lt;</span><span style="color: #0000dd;">1</span><span style="color: #339933;">,</span><span style="color: #0000dd;">1</span><span style="color: #339933;">&gt;&gt;&gt;</span><span style="color: #009900;">&#40;</span>da<span style="color: #339933;">,</span> db<span style="color: #339933;">,</span> dc<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  cudaMemcpy<span style="color: #009900;">&#40;</span>c<span style="color: #339933;">,</span> dc<span style="color: #339933;">,</span> <span style="color: #993333;">sizeof</span><span style="color: #009900;">&#40;</span><span style="color: #993333;">float</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">,</span> cudaMemcpyDeviceToHost<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  cudaFree<span style="color: #009900;">&#40;</span>da<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  cudaFree<span style="color: #009900;">&#40;</span>db<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
  cudaFree<span style="color: #009900;">&#40;</span>dc<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
  Rprintf<span style="color: #009900;">&#40;</span><span style="color: #ff0000;">&quot;%.0f + %.0f = %.0f<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">,</span> <span style="color: #339933;">*</span>a<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>b<span style="color: #339933;">,</span> <span style="color: #339933;">*</span>c<span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
<span style="color: #009900;">&#125;</span></pre></td></tr></table></div>

<p>There are two noteworthy things here. First, we still need a C function that allocates memory on the GPU and launches the kernel (the function we want the GPU to run) - we can't call the kernel directly from R. Second, we need to make sure that R knows the name of the C function we want to call using <code>.C</code>. We do this with the <code>extern "C"</code> linkage specification. This tells the compiler to treat <code>gpuadd</code> as a C function so that in the shared library file (i.e. <code>gpufoo.so</code>) it's still called "gpuadd." Otherwise, the compiler treats the function as a C++ function and mangles its name in the library, leaving us without an obvious name to pass to <code>.C</code>. There's also a quirk here - all of the variables are floats instead of doubles, since many GPUs only support single precision floating point math or run double precision much more slowly. This will affect how we write our R wrapper to call <code>gpuadd</code>. If your GPU supports double precision floating point operations you can ignore that part, but keep in mind that many GPUs don't if you want to distribute your code widely. We'll assume that the code above is saved in <code>gpufoo.cu</code>.</p>
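<p>To see the name mangling that <code>extern "C"</code> avoids, here is a minimal shell sketch. It assumes <code>g++</code> and <code>nm</code> are installed; the file name <code>mangle_demo.cpp</code> and the functions <code>add_c</code> and <code>add_cpp</code> are made up for illustration and are not part of <code>gpufoo.cu</code>.</p>

```shell
#!/bin/sh
# Hypothetical demo: two identical functions, only one with C linkage.
printf '%s\n' \
  'extern "C" void add_c(float *a, float *b, float *c) { *c = *a + *b; }' \
  'void add_cpp(float *a, float *b, float *c) { *c = *a + *b; }' \
  > mangle_demo.cpp

# Check that the tools this sketch assumes are actually available.
have_tools=yes
for tool in g++ nm; do
  command -v "$tool" > /dev/null || have_tools=no
done

if [ "$have_tools" = yes ]; then
  g++ -c mangle_demo.cpp -o mangle_demo.o
  # Keep only the two symbols defined above.
  SYMBOLS=$(nm mangle_demo.o | grep 'add_c')
  echo "$SYMBOLS"
else
  echo "g++ or nm not found; skipping the demo"
fi

# Remove the temporary demo files.
rm -f mangle_demo.cpp mangle_demo.o
```

<p>With GCC on Linux, <code>add_c</code> should appear under its own name while <code>add_cpp</code> shows up as something like <code>_Z7add_cppPfS_S_</code>. Running <code>nm</code> on a compiled <code>gpufoo.so</code> should be a quick way to confirm that <code>gpuadd</code> is exported under the name <code>.C</code> expects.</p>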
<p>So that's the code. How do we compile it? This time there isn't a simple R command we can call from the terminal, but the output from compiling a C file (<code>R CMD SHLIB file.c</code>) gives us clues about how to do it manually:</p>

<div class="wp_codebox"><table><tr id="p7421"><td class="line_numbers"><pre>1
2
3
</pre></td><td class="code" id="p74code21"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">gcc</span> <span style="color: #660033;">-std</span>=gnu99 -I<span style="color: #000000; font-weight: bold;">/</span>apps<span style="color: #000000; font-weight: bold;">/</span>lib64<span style="color: #000000; font-weight: bold;">/</span>R<span style="color: #000000; font-weight: bold;">/</span>include  -I<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>include \
    <span style="color: #660033;">-fpic</span> <span style="color: #660033;">-g</span> <span style="color: #660033;">-O2</span> <span style="color: #660033;">-c</span> foo.c <span style="color: #660033;">-o</span> foo.o
<span style="color: #c20cb9; font-weight: bold;">gcc</span> <span style="color: #660033;">-std</span>=gnu99 <span style="color: #660033;">-shared</span> -L<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>lib64 <span style="color: #660033;">-o</span> foo.so foo.o</pre></td></tr></table></div>

<p>As I mentioned before, these are precisely the commands we would have used if we wanted to compile manually. I'll go through them step by step. Starting with the first command, the compile step, <code>gcc</code> is the compiler we call on our source code. <code>-std=gnu99</code> tells the compiler which standard of C to use. We'll ignore this option. <code>-I/...</code> tells the compiler to look in the folder <code>/...</code> for any included headers. Depending on how your system has been set up, the particular folders where R automatically looks may be different from mine. Make a note of which folders are displayed here as you'll need them later. These folders contain <code>R.h</code> among other headers. <code>-fpic</code> essentially tells the compiler to make the code suitable for a shared library - i.e. what we'll be loading with <code>dyn.load()</code> from R. <code>-g</code> generates debugging information while <code>-O2</code> tells the compiler to optimize the code nearly as much as possible. Neither is essential for our purposes though both are useful. Finally, <code>-c</code> tells the compiler to only compile and assemble the code - i.e. don't link it - while <code>foo.c</code> is the name of the source file and <code>-o foo.o</code> tells the compiler what to name the output file.</p>
<p>The second command is the linking step. Here, <code>-shared</code> tells the linker that the output will be a shared library while <code>-L/...</code> tells it where to find previously compiled libraries that this code may rely on. This is probably where the compiled versions of the R libraries live. Again, use the path R outputs for you, not necessarily the path I have here. Finally, <code>-o foo.so</code> tells the linker what to name the output file and <code>foo.o</code> is the name of the object file that needs to be linked.</p>
<p>So now we need to use these two statements to construct similar <code>nvcc</code> commands to compile our CUDA C code. The big reveal first, then the explanation:</p>

<div class="wp_codebox"><table><tr id="p7422"><td class="line_numbers"><pre>1
2
3
</pre></td><td class="code" id="p74code22"><pre class="bash" style="font-family:monospace;">nvcc <span style="color: #660033;">-g</span> <span style="color: #660033;">-G</span> <span style="color: #660033;">-O2</span> -I<span style="color: #000000; font-weight: bold;">/</span>apps<span style="color: #000000; font-weight: bold;">/</span>lib64<span style="color: #000000; font-weight: bold;">/</span>R<span style="color: #000000; font-weight: bold;">/</span>include -I<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>include \
     <span style="color: #660033;">-Xcompiler</span> <span style="color: #ff0000;">&quot;-fpic&quot;</span> <span style="color: #660033;">-c</span> gpufoo.cu <span style="color: #660033;">-o</span> gpufoo.o
nvcc <span style="color: #660033;">-shared</span> -L<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>local<span style="color: #000000; font-weight: bold;">/</span>lib64 gpufoo.o <span style="color: #660033;">-o</span> gpufoo.so</pre></td></tr></table></div>

<p><code>-g</code> and <code>-O2</code> do the same thing here as before with the caveat that they only apply to code that runs on the host (i.e. not on the GPU). That is to say <code>-g</code> generates debugging information for only the host code and <code>-O2</code> optimizes only the host code. <code>-G</code>, on the other hand, generates debugging information for the <em>device</em> code, i.e. the code that runs on the GPU. So far, nothing much different from before. The wrinkle is in this component: <code>-Xcompiler "-fpic"</code>. This tells <code>nvcc</code> to pass the arguments in quotes on to the host C compiler, i.e. to <code>gcc</code>. The quoted argument is exactly the same as above, as are the rest of the arguments outside of the quotes. The link step is basically identical to before as well.</p>
<p>In reality, there's no need for some of the options in the compile and link steps. An alternative would be:</p>

<div class="wp_codebox"><table><tr id="p7423"><td class="line_numbers"><pre>1
2
3
</pre></td><td class="code" id="p74code23"><pre class="bash" style="font-family:monospace;">nvcc <span style="color: #660033;">-g</span> <span style="color: #660033;">-G</span> <span style="color: #660033;">-O2</span> -I<span style="color: #000000; font-weight: bold;">/</span>apps<span style="color: #000000; font-weight: bold;">/</span>lib64<span style="color: #000000; font-weight: bold;">/</span>R<span style="color: #000000; font-weight: bold;">/</span>include <span style="color: #660033;">-Xcompiler</span> \
     <span style="color: #ff0000;">&quot;-Wall -Wextra -fpic&quot;</span> <span style="color: #660033;">-c</span> gpufoo.cu <span style="color: #660033;">-o</span> gpufoo.o
nvcc <span style="color: #660033;">-shared</span> <span style="color: #660033;">-lm</span> gpufoo.o <span style="color: #660033;">-o</span> gpufoo.so</pre></td></tr></table></div>

<p>This version removes a path from both the compile and the link step because nothing in those folders is relevant to compiling the above program - at least on my system. In the compile step I added two arguments to be passed to the host compiler: <code>-Wall</code> and <code>-Wextra</code>. These tell the compiler to show warnings for things in our code that commonly cause errors - very useful for preventing bugs. Finally, in the link step I added the option <code>-lm</code>. In general, the option <code>-lname</code> links the library named "name." In this case, it links the math library, which we would be using if we had <code>#include &lt;math.h&gt;</code> in our source file. If, for example, we were using NVIDIA's CUBLAS library we would need <code>-lcublas</code> here.</p>
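<p>The two steps can be collected into a small build script. This is only a sketch under a few assumptions: the variable names are mine, the R include folder is looked up with <code>Rscript</code> and <code>R.home("include")</code> when R is on the path (falling back to the folder from my system otherwise), and the warning flags are written in the comma-separated form that <code>-Xcompiler</code> also accepts, so the command survives being stored in a variable. The script always prints the commands and only runs them if both <code>nvcc</code> and <code>gpufoo.cu</code> are present.</p>

```shell
#!/bin/sh
# Sketch of a build script for gpufoo.cu; variable names are illustrative.
SRC=gpufoo.cu
OBJ=gpufoo.o
LIB=gpufoo.so

# Ask R where its headers live; otherwise fall back to the folder used
# in this post (an assumption that may not match your system).
if command -v Rscript > /dev/null; then
  R_INC="-I$(Rscript -e 'cat(R.home("include"))')"
else
  R_INC="-I/apps/lib64/R/include"
fi

# Compile and link commands, mirroring the two nvcc steps above.
COMPILE="nvcc -g -G -O2 $R_INC -Xcompiler -Wall,-Wextra,-fpic -c $SRC -o $OBJ"
LINK="nvcc -shared -lm $OBJ -o $LIB"

echo "$COMPILE"
echo "$LINK"

# Only run the commands when nvcc and the source file are available.
if ! command -v nvcc > /dev/null; then
  echo "nvcc not found; commands printed only"
elif [ ! -f "$SRC" ]; then
  echo "$SRC not found; commands printed only"
else
  # Link only if the compile step succeeded.
  if $COMPILE; then
    $LINK
  fi
fi
```

<p>Because the commands are expanded from plain strings, this sketch will not cope with include folders that contain spaces; in that case it is probably simpler to type the two commands out by hand.</p>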
<p>Now in order to call this function from R, our wrapper needs to be slightly different:</p>

<div class="wp_codebox"><table><tr id="p7424"><td class="line_numbers"><pre>1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
</pre></td><td class="code" id="p74code24"><pre class="rsplus" style="font-family:monospace;">gpuadd <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">function</span><span style="color: #080;">&#40;</span>a,b<span style="color: #080;">&#41;</span><span style="color: #080;">&#123;</span>
  <span style="color: #0000FF; font-weight: bold;">if</span><span style="color: #080;">&#40;</span><span style="color: #080;">!</span><span style="color: #0000FF; font-weight: bold;">is.<span style="">loaded</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;gpuadd&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
    <span style="color: #0000FF; font-weight: bold;">dyn.<span style="">load</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;gpufoo.so&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
  <span style="color: #228B22;">##tell R to convert to single precision before copying to .C</span>
  <span style="color: #0000FF; font-weight: bold;">c</span> <span style="color: #080;">&lt;-</span> <span style="color: #0000FF; font-weight: bold;">single</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">mode</span><span style="color: #080;">&#40;</span>a<span style="color: #080;">&#41;</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;single&quot;</span>
  <span style="color: #0000FF; font-weight: bold;">mode</span><span style="color: #080;">&#40;</span>b<span style="color: #080;">&#41;</span> <span style="color: #080;">&lt;-</span> <span style="color: #ff0000;">&quot;single&quot;</span>
&nbsp;
  z <span style="color: #080;">&lt;-</span> .<span style="">C</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;gpuadd&quot;</span>,a,b,<span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#41;</span>
  <span style="color: #0000FF; font-weight: bold;">c</span> <span style="color: #080;">&lt;-</span> z$c
&nbsp;
  <span style="color: #228B22;">##Change to standard numeric to avoid dangerous side effects</span>
  <span style="color: #0000FF; font-weight: bold;">attr</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span>,<span style="color: #ff0000;">&quot;Csingle&quot;</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&lt;-</span> NULL 
  <span style="color: #0000FF; font-weight: bold;">return</span><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">c</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span></pre></td></tr></table></div>

<p>The essential difference here is that <code>.C</code>'s arguments need to have the "Csingle" attribute so that it knows to copy them as floats instead of doubles. Integer arguments need similar care: they must be stored as R integers (e.g. via <code>as.integer</code>) to arrive in C as <code>int</code>. Finally, the attribute needs to be removed before returning the result to avoid some dangerous side effects, which is why it is set to <code>NULL</code> right before the function returns its output.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.themattsimpson.com/2012/06/29/writing-cuda-c-extensions-for-r/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
