adding random variables

adding RVs
library(ggplot2)
set.seed(87349)

adding UNIF

First, simulate several \(UNIF(0,1)\) RVs, along with sums of these RVs, and store them in a dataframe.

uniforms <- data.frame(U=runif(100000), V=runif(100000), W=runif(100000), 
                       X=runif(100000), Y=runif(100000), Z=runif(100000))
uniforms$two <- uniforms$U + uniforms$V
uniforms$three <- uniforms$U + uniforms$V + uniforms$W
uniforms$four <- uniforms$U + uniforms$V + uniforms$W + uniforms$X
uniforms$five <- uniforms$U + uniforms$V + uniforms$W + uniforms$X + uniforms$Y
head(uniforms)
##           U           V          W         X           Y          Z
## 1 0.2237303 0.212718492 0.21313572 0.8592435 0.230055725 0.39315550
## 2 0.6732303 0.049370517 0.09932315 0.5116051 0.323103104 0.51446665
## 3 0.1352415 0.004280132 0.06291041 0.2264902 0.005689052 0.61851831
## 4 0.8180958 0.846841856 0.58342319 0.5776902 0.170271991 0.04613788
## 5 0.4162543 0.663753846 0.88186666 0.1426889 0.240065767 0.08535646
## 6 0.5642494 0.099078418 0.60979669 0.2343633 0.838472871 0.17989737
##         two     three      four      five
## 1 0.4364488 0.6495845 1.5088279 1.7388837
## 2 0.7226008 0.8219240 1.3335291 1.6566322
## 3 0.1395216 0.2024320 0.4289222 0.4346113
## 4 1.6649377 2.2483608 2.8260510 2.9963230
## 5 1.0800081 1.9618748 2.1045637 2.3446294
## 6 0.6633279 1.2731245 1.5074878 2.3459607
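
As a quick sanity check on sums like these, here is a minimal base-R sketch that is not part of the original analysis. It relies on the standard facts that a \(UNIF(0,1)\) RV has mean 1/2 and variance 1/12, so a sum of \(n\) independent copies has mean \(n/2\) and sd \(\sqrt{n/12}\). The names `n_sims`, `sim`, and `check` are illustrative, and the seed is arbitrary, so the draws differ from the `uniforms` dataframe above.

```r
set.seed(1)                        # arbitrary seed, used only for this sketch
n_sims <- 100000
# one column per independent UNIF(0,1) draw
sim <- matrix(runif(n_sims * 5), ncol = 5)

check <- data.frame(n = 1:5)
# sum the first n columns, then compare sample moments with theory
check$sample_mean <- sapply(check$n, function(n) mean(rowSums(sim[, 1:n, drop = FALSE])))
check$theory_mean <- check$n / 2
check$sample_sd   <- sapply(check$n, function(n) sd(rowSums(sim[, 1:n, drop = FALSE])))
check$theory_sd   <- sqrt(check$n / 12)
print(check)
```

The sample and theoretical columns should agree to about two decimal places at this sample size.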

Each pair of graphs is:
Left side - histogram of the RV
Right side - density of the simulated RV in blue, Normal RV with the same mean and sd as the simulated RV in orange.

single \(UNIF(0,1)\)

ggplot(uniforms, aes(Z)) + geom_histogram(boundary=0) + theme_minimal()
ggplot(uniforms, aes(Z)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=mean(uniforms$Z), 
                                     sd=sd(uniforms$Z)), color="orange", size=1) + 
  theme_minimal()

two \(UNIF(0,1)\)

ggplot(uniforms, aes(two)) + geom_histogram() + theme_minimal()
ggplot(uniforms, aes(two)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=mean(uniforms$two), 
                                     sd=sd(uniforms$two)), color="orange", size=1) + 
  theme_minimal()

three \(UNIF(0,1)\)

ggplot(uniforms, aes(three)) + geom_histogram() + theme_minimal()
ggplot(uniforms, aes(three)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=mean(uniforms$three), 
                                     sd=sd(uniforms$three)), color="orange", size=1) + 
  theme_minimal()

The sum of three \(UNIF(0,1)\) RVs looks less like a parabola than like a Normal, at least by eye…
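
To put a number on the "by eye" comparison, the sketch below evaluates the exact density of a sum of \(n\) independent \(UNIF(0,1)\) RVs (the Irwin-Hall distribution, which is made of polynomial pieces of degree \(n-1\)) next to the Normal density with the same mean and sd. The function name `irwin_hall_pdf` is my own, and this is an illustration rather than something from the original analysis.

```r
# Density of the sum of n independent UNIF(0,1) RVs (Irwin-Hall distribution)
irwin_hall_pdf <- function(x, n) {
  sapply(x, function(xi) {
    if (xi < 0 || xi > n) return(0)
    k <- 0:floor(xi)
    sum((-1)^k * choose(n, k) * (xi - k)^(n - 1)) / factorial(n - 1)
  })
}

x <- seq(0, 3, by = 0.5)
comparison <- data.frame(
  x = x,
  sum_of_three = irwin_hall_pdf(x, 3),
  # Normal with the same mean (3/2) and sd (sqrt(3/12)) as the sum
  normal = dnorm(x, mean = 3 / 2, sd = sqrt(3 / 12))
)
print(round(comparison, 4))
```

At the peak, x = 1.5, the exact density is 0.75 while the matching Normal density is about 0.80, so the two curves are close but not identical.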

four \(UNIF(0,1)\)

ggplot(uniforms, aes(four)) + geom_histogram() + theme_minimal()
ggplot(uniforms, aes(four)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=mean(uniforms$four), 
                                     sd=sd(uniforms$four)), color="orange", size=1) + 
  theme_minimal()

five \(UNIF(0,1)\)

ggplot(uniforms, aes(five)) + geom_histogram() + theme_minimal()
ggplot(uniforms, aes(five)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=mean(uniforms$five), 
                                     sd=sd(uniforms$five)), color="orange", size=1) + 
  theme_minimal()

As you add more independent \(UNIF(0,1)\) RVs together, the sum looks more and more Normal, as the central limit theorem predicts.
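
One hedged way to quantify "more and more Normal" is excess kurtosis, which is 0 for a Normal RV and \(-6/(5n)\) for a sum of \(n\) independent \(UNIF(0,1)\) RVs. The sketch below is an illustration added here, not part of the original analysis. The helper `excess_kurtosis` and the names `sim` and `kurt` are assumptions of mine, and the seed is arbitrary.

```r
set.seed(2)                        # arbitrary seed, used only for this sketch
n_sims <- 100000
sim <- matrix(runif(n_sims * 5), ncol = 5)

# sample excess kurtosis: close to 0 for Normal data
excess_kurtosis <- function(x) {
  z <- (x - mean(x)) / sd(x)
  mean(z^4) - 3
}

kurt <- data.frame(n = 1:5)
kurt$sample <- sapply(kurt$n, function(n) excess_kurtosis(rowSums(sim[, 1:n, drop = FALSE])))
kurt$theory <- -6 / (5 * kurt$n)   # -1.2 / n for a sum of n UNIF(0,1) RVs
print(kurt)
```

The values shrink toward 0 as \(n\) grows, which matches the plots: the flat-topped single uniform is far from Normal, and the sum of five is already close.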

adding Normals

standard Normals (0,1)

First, simulate several \(N(0,1)\) RVs, along with sums of these RVs, and store them in a dataframe.

norms <- data.frame(U=rnorm(100000), V=rnorm(100000), W=rnorm(100000), 
                       X=rnorm(100000), Y=rnorm(100000), Z=rnorm(100000))
norms$two <- norms$U + norms$V
norms$three <- norms$U + norms$V + norms$W
norms$four <- norms$U + norms$V + norms$W + norms$X
norms$five <- norms$U + norms$V + norms$W + norms$X + norms$Y
head(norms)
##            U          V          W          X           Y            Z
## 1  0.9581680  0.1310740 -1.3868298 -2.8754473  1.00020538  0.006235532
## 2  0.5813510 -0.1671726  0.2535913 -1.8112824 -0.02783789 -0.157884604
## 3  1.1681468 -1.3054014  0.5745714  1.1506819  2.65099417  1.064256780
## 4  0.6596708 -0.1528759 -1.3677272 -0.8540412 -1.64139015  1.436712707
## 5 -0.7457130 -1.5081185 -1.4254729  0.1393246 -1.53245196 -2.052007159
## 6  0.2144402  0.3832123  1.5384686 -0.7112149  0.61895148  1.614482291
##          two      three      four      five
## 1  1.0892420 -0.2975879 -3.173035 -2.172830
## 2  0.4141783  0.6677696 -1.143513 -1.171351
## 3 -0.1372546  0.4373167  1.587999  4.238993
## 4  0.5067949 -0.8609323 -1.714974 -3.356364
## 5 -2.2538314 -3.6793043 -3.539980 -5.072432
## 6  0.5976525  2.1361211  1.424906  2.043858

Each pair of graphs is:
Left side - histogram of the RV
Right side - density of the simulated RV in blue, Normal RV with the theoretical mean 0 and sd \(\sqrt{n}\) for a sum of \(n\) RVs in orange.

one \(N(0,1)\)

ggplot(norms, aes(Z)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(Z)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=1), color="orange", size=1) + 
  theme_minimal()

two \(N(0,1)\)

ggplot(norms, aes(two)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(two)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=sqrt(2)), color="orange", size=1) + 
  theme_minimal()

three \(N(0,1)\)

ggplot(norms, aes(three)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(three)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=sqrt(3)), color="orange", size=1) + 
  theme_minimal()

four \(N(0,1)\)

ggplot(norms, aes(four)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(four)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=sqrt(4)), color="orange", size=1) + 
  theme_minimal()

five \(N(0,1)\)

ggplot(norms, aes(five)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(five)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=sqrt(5)), color="orange", size=1) + 
  theme_minimal()

1 \(N(0,1)\) in grey, 2 \(N(0,1)\) in green, 3 \(N(0,1)\) in orange, 4 \(N(0,1)\) in blue, 5 \(N(0,1)\) in red all plotted on same axes.

ggplot(norms) + geom_density(aes(Z), color="grey20", size=1) +
  geom_density(aes(two), color="green", size=1) +
  geom_density(aes(three), color="orange", size=1) +
  geom_density(aes(four), color="blue", size=1) +
  geom_density(aes(five), color="red", size=1) + 
  xlab("simulated value") + xlim(c(-12,12)) + theme_minimal()

Normals other than standard

simulate some different Normals:
A is \(N(1,1)\)
B is \(N(0,2)\)
[note: I’m using \(N( \mu, sd)\), with the standard deviation rather than the variance as the second parameter, since that is what R uses.]
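
Since every sd used below depends on this convention, here is a small base-R sketch of it, added for illustration. `rnorm()` takes the standard deviation as its `sd` argument, so draws with `sd=2` have variance close to 4. The name `b_check` is hypothetical and the seed is arbitrary.

```r
set.seed(3)                                  # arbitrary seed, used only for this sketch
b_check <- rnorm(100000, mean = 0, sd = 2)   # same parameters as B
# sample sd should be close to 2, sample variance close to 4
print(c(sample_sd = sd(b_check), sample_var = var(b_check)))
```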

Also make sums of multiple iid A-like RVs (\(N(1,1)\)) and sums of multiple iid B-like RVs (\(N(0,2)\)).

norms$A <- rnorm(100000, mean=1, sd=1)
norms$B <- rnorm(100000, mean=0, sd=2)
norms$Atwo <- norms$A + rnorm(100000, mean=1, sd=1) 
norms$Athree <- norms$Atwo + rnorm(100000, mean=1, sd=1)
norms$Afour <- norms$Athree + rnorm(100000, mean=1, sd=1)
norms$Afive <- norms$Afour + rnorm(100000, mean=1, sd=1)
norms$Btwo <- norms$B + rnorm(100000, mean=0, sd=2) 
norms$Bthree <- norms$Btwo + rnorm(100000, mean=0, sd=2)
norms$Bfour <- norms$Bthree + rnorm(100000, mean=0, sd=2)
norms$Bfive <- norms$Bfour + rnorm(100000, mean=0, sd=2)
head(norms[11:20])
##            A          B       Atwo     Athree    Afour    Afive       Btwo
## 1 -1.0438201 -4.8441826  0.8650604  1.3168871 2.318044 2.741057 -3.0955626
## 2  1.1912393  0.9785544  2.5942400  5.6255168 7.080638 9.948361 -0.1011114
## 3  1.9041271 -1.0798495  4.0626890  2.8878648 4.379048 5.087002 -5.1020909
## 4  0.5565502 -0.3527550  2.6218529  4.6465501 4.608646 4.213038 -2.8222264
## 5 -1.8627518  2.8063106 -0.4423007 -0.5511588 1.139664 1.968801  5.8245019
## 6  0.8872037 -5.0739639  1.5333508  4.2351763 5.117495 7.656440 -3.2865228
##      Bthree     Bfour     Bfive
## 1 -1.062202 -2.798807 -4.693949
## 2 -2.637423 -6.391763 -4.758033
## 3 -3.980746 -5.641978 -8.529242
## 4 -4.274471 -3.920905 -1.883908
## 5  9.955951  7.319866  6.562164
## 6 -5.537575 -5.019152 -4.418313

Each pair of graphs is:
Left side - histogram of the RV
Right side - density of the simulated RV in blue, Normal RV with the theoretical mean and sd in orange.

one A (\(N(1,1)\))

ggplot(norms, aes(A)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(A)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=1, 
                                     sd=1), color="orange", size=1) + 
  theme_minimal()

five As (\(N(1,1)\))

ggplot(norms, aes(Afive)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(Afive)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=5, 
                                     sd=sqrt(5)), color="orange", size=1) + 
  theme_minimal()

one B (\(N(0,2)\))

ggplot(norms, aes(B)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(B)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=2), color="orange", size=1) + 
  theme_minimal()

five Bs (\(N(0,2)\))

ggplot(norms, aes(Bfive)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(Bfive)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=sqrt(20)), color="orange", size=1) + 
  theme_minimal()
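
The `sd=sqrt(20)` used for five Bs comes from adding variances: \(5 \times 2^2 = 20\). A common slip is to treat the sum of five independent Bs as if it were \(5 \times B\), which would have sd \(5 \times 2 = 10\) instead. The base-R sketch below contrasts the two. It is an added illustration with an arbitrary seed, and `draws`, `sum_of_five`, `five_times_b`, and `sd_check` are names of my own.

```r
set.seed(4)                        # arbitrary seed, used only for this sketch
n_sims <- 100000
# five independent N(0, sd = 2) draws per row
draws <- matrix(rnorm(n_sims * 5, mean = 0, sd = 2), ncol = 5)
sum_of_five  <- rowSums(draws)     # B1 + B2 + B3 + B4 + B5
five_times_b <- 5 * draws[, 1]     # 5 * B1: a single draw scaled up

sd_check <- c(sum_of_five   = sd(sum_of_five),
              theory_sum    = sqrt(5 * 2^2),
              five_times_b  = sd(five_times_b),
              theory_scaled = 5 * 2)
print(round(sd_check, 3))
```

The first pair should both be near 4.47 and the second pair near 10, so the sum of independent draws is much less spread out than a single draw scaled by five.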

Comparing A (\(N(1,1)\)) in blue and B (\(N(0,2)\)) in green, with \(N(0,1)\) in dashed grey.

ggplot(norms) + geom_density(aes(A), color="blue", size=1) + 
  geom_density(aes(B), color="green", size=1) + 
  stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=1), color="grey70", size=1, linetype=5) + 
  theme_minimal() + xlab("simulated value")

Comparing one A (dashed) with the sum of 2 As in blue, and one B (dashed) with the sum of 2 Bs in green. \(N(0,1)\) in dashed grey in each.

ggplot(norms) + stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=1), color="grey70", size=1, linetype=5) +
  geom_density(aes(Atwo), color="blue", size=1) + 
  geom_density(aes(A), color=alpha("blue",0.6), size=1, linetype=2) +
  theme_minimal() + labs(x="simulated value", y="density")

ggplot(norms) + stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=1), color="grey70", size=1, linetype=5) +
  geom_density(aes(Btwo), color="green", size=1) + 
  geom_density(aes(B), color=alpha("green",0.6), size=1, linetype=2) + 
  theme_minimal() + labs(x="simulated value", y="density")

Comparing one A (dashed) with sums of As (2,3,4,5) in blue, \(N(0,1)\) in dashed grey.

ggplot(norms) + stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=1), color="grey70", size=1, linetype=5) +
  geom_density(aes(A), color="#0000ff", size=1, linetype=2) +
  geom_density(aes(Atwo), color="#3333ff", size=1) +
  geom_density(aes(Athree), color="#6666ff", size=1) +
  geom_density(aes(Afour), color="#9999ff", size=1) +
  geom_density(aes(Afive), color="#ccccff", size=1) +
  theme_minimal() + labs(x="simulated value", y="density")

Comparing one B (dashed) with sums of Bs (2,3,4,5) in green, \(N(0,1)\) in dashed grey.

ggplot(norms) + stat_function(fun=dnorm, args=list(mean=0, 
                                     sd=1), color="grey70", size=1, linetype=5) +
  geom_density(aes(B), color="#008000", size=1, linetype=2) + 
  geom_density(aes(Btwo), color="#00b300", size=1) + 
  geom_density(aes(Bthree), color="#00e600", size=1) + 
  geom_density(aes(Bfour), color="#1aff1a", size=1) + 
  geom_density(aes(Bfive), color="#4dff4d", size=1) + 
  theme_minimal()  + labs(x="simulated value", y="density")

Now add three different Normals:
\(N(0,1)\) + \(N(1,1)\) + \(N(0,2)\)

norms$ZAB <- norms$Z + norms$A + norms$B

Plot Z+A+B, which should be \(N(1,\sqrt6)\) since:
\(\mu=0+1+0=1\) and,
\(sd=\sqrt{1^2+1^2+2^2}=\sqrt6\)
Histogram on the left; on the right is the density of ZAB in blue and \(N(1,\sqrt6)\) in orange:

ggplot(norms, aes(ZAB)) + geom_histogram() + theme_minimal()
ggplot(norms, aes(ZAB)) + geom_density(color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=1, 
                                     sd=sqrt(6)), color="orange", size=1) + 
  theme_minimal()
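
The parameters claimed for Z+A+B can also be checked numerically. The sketch below is an added illustration: for independent Normals the means add and the variances add, so the sd of the sum is the square root of the summed squared sds. The helper `sum_normal_params` is a hypothetical name, and the simulation uses fresh draws with an arbitrary seed rather than the `norms` dataframe.

```r
# parameters of a sum of independent Normals, given their means and sds
sum_normal_params <- function(means, sds) {
  c(mean = sum(means), sd = sqrt(sum(sds^2)))
}

# Z is N(0,1), A is N(1,1), B is N(0,2), in the (mean, sd) convention
theory <- sum_normal_params(means = c(0, 1, 0), sds = c(1, 1, 2))

set.seed(5)                        # arbitrary seed, used only for this sketch
n_sims <- 100000
zab <- rnorm(n_sims, 0, 1) + rnorm(n_sims, 1, 1) + rnorm(n_sims, 0, 2)
print(rbind(theory = theory, simulated = c(mean = mean(zab), sd = sd(zab))))
```

The same helper covers the iid cases above: `sum_normal_params(rep(1, 5), rep(1, 5))` gives the mean 5 and sd \(\sqrt5\) used for five As.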

Overlay of Z (\(N(0,1)\)) in orange, A (\(N(1,1)\)) in green, and B (\(N(0,2)\)) in red, with the result of Z+A+B in blue and the theoretical \(N(1,\sqrt6)\) in dotted grey.

ggplot(norms) + geom_density(aes(A), color="#99ff99", size=1) +  
  geom_density(aes(B), color="#ff8080", size=1) + 
  geom_density(aes(Z), color="#ffd280", size=1) + 
  geom_density(aes(ZAB), color="blue", size=1) + 
  stat_function(fun=dnorm, args=list(mean=1, 
                                     sd=sqrt(6)), color="grey70", size=1, linetype=3) +
  theme_minimal() + xlab("simulated value")