TASK 2

1.

The density of \((X_i, Y_i)^T\) can be expressed as \(f(x, y)=c\,\mathbb{1}\lbrace 0 < x < y < 1 \rbrace\), where \(c > 0\) is a normalizing constant. It holds that \(c = 2\) and hence \(f(x, y)=2\,\mathbb{1}\lbrace 0 < x < y < 1 \rbrace.\)
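
Indeed, integrating the density over its support gives \[\begin{gather} 1 = \int_0^1 \int_x^1 c \,dy \,dx = c\int_0^1 (1-x) \,dx = \frac{c}{2}, \end{gather}\] so \(c = 2.\)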

The marginal densities are: \(f_X(x)=\int_{x}^{1} f(x, y) \,dy =2(1-x)\mathbb{1}_{\lbrace x \in (0;1)\rbrace }\), \(f_Y(y)=\int_{0}^{y} f(x, y) \,dx = 2y\mathbb{1}_{\lbrace y \in (0;1)\rbrace }.\)

The components of a random vector \((X, Y)^T\) are independent if and only if the marginal and joint densities satisfy \(f_X(x)\,f_Y(y)=f(x, y)\) for (almost) all \((x, y).\) Here \(f_X(x)\,f_Y(y) = 4y(1-x)\,\mathbb{1}_{\lbrace x \in (0;1)\rbrace }\mathbb{1}_{\lbrace y \in (0;1)\rbrace } \neq 2\,\mathbb{1}\lbrace 0 < x < y < 1 \rbrace = f(x, y)\): for instance, at \((x, y) = (0.8,\, 0.2)\) the joint density vanishes while the product of the marginals equals \(0.16.\) Hence the random variables \(X,\) \(Y\) are not independent.

2.

The conditional density can be calculated as \(f_{Y \vert X}(y \vert x) = \frac{f_{X,Y}(x,y)}{f_X(x)}=\frac{2\,\mathbb{1}_{\lbrace 0 < x < y < 1 \rbrace}}{2(1-x)\mathbb{1}_{\lbrace x \in (0;1)\rbrace }}=\frac{\mathbb{1}_{\lbrace x < y < 1 \rbrace}}{1-x}\) for \(x \in (0;1).\) From the relationship \(f(x, y) = f_{Y \vert X}(y \vert x)\,f_X(x)\) we are able to generate a sample from the distribution with the density \(f(x, y)\) using the quantile functions \(F^{-1}_{Y \vert X}\), \(F^{-1}_{X}\) and random samples from the uniform distribution on \((0;\,1).\)

Since \(F_X(x) = \int_0^x 2(1-t) \,dt = 1-(1-x)^2\) for \(x \in (0;1)\) and \(F_{Y \vert X}(y \vert x) = \frac{y-x}{1-x}\) for \(x < y < 1,\) it holds: \(F^{-1}_{X} ( z ) =1-\sqrt{1-z},\,\) \(F^{-1}_{Y \vert X} ( z ) = z (1-x)+x\) for \(z \in (0;\,1).\)

Idea: We generate a random sample \(Z_1,\dots,Z_n\) from the uniform distribution on \((0;\,1)\) and set \(X_i = F^{-1}_{X}(Z_i)\) for all \(i = 1,\dots,n.\) By the inverse transform method the \(X_i\) then follow the distribution with distribution function \(F_X.\) Analogously for \(Y \vert X,\) with the generated \(X_i\) plugged in for \(x.\)

A sample of size 1000 is shown in the following scatterplot.

n <- 1000
set.seed(2005)
Z <- runif(n)                # Z_i ~ U(0, 1)
X <- 1 - sqrt(1 - Z)         # X_i = F_X^{-1}(Z_i)
U <- runif(n)                # U_i ~ U(0, 1), independent of Z
Y <- (1 - X) * U + X         # Y_i = F_{Y|X}^{-1}(U_i), given X_i
plot(X, Y)
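
As a quick additional check, we can overlay the theoretical marginal densities on histograms of \(X\) and \(Y.\) A minimal sketch reusing the sample generated above:

##  Compare empirical and theoretical marginal densities:
hist(X, freq = FALSE, main = "Marginal density of X")
curve(2 * (1 - x), from = 0, to = 1, add = TRUE, col = "red")  # f_X(x) = 2(1 - x)
hist(Y, freq = FALSE, main = "Marginal density of Y")
curve(2 * x, from = 0, to = 1, add = TRUE, col = "red")        # f_Y(y) = 2y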

Based on the random sample we can calculate estimates of the mean vector and the covariance matrix.

The sample mean for our data is \[\begin{gather} \begin{pmatrix} \overline{X_{1000}} \\ \overline{Y_{1000}} \end{pmatrix} = \begin{pmatrix} 0.33\\ 0.68 \end{pmatrix} \end{gather}\] and the covariance matrix estimate is \[\begin{gather} \hat{\Sigma}_{1000} = \begin{pmatrix} 0.05 & 0.02\\ 0.02 & 0.05 \end{pmatrix} \end{gather}\]
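
These estimates can be computed directly from the generated sample, e.g. as follows (a minimal sketch reusing the vectors X and Y from above):

##  Estimates of the mean vector and the covariance matrix:
colMeans(cbind(X, Y))
cov(cbind(X, Y))

For comparison, the theoretical moments are \(\operatorname{E} X = 1/3,\) \(\operatorname{E} Y = 2/3,\) \(\operatorname{var} X = \operatorname{var} Y = 1/18 \approx 0.056\) and \(\operatorname{cov}(X, Y) = 1/36 \approx 0.028,\) which is consistent with the estimates above.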

Let us verify using graphical tools that the random sample indeed follows the given joint distribution. For that, 3D histograms and their 2D projections can be used. The following graphics for n = 400, 1000 and 10000 show that with increasing sample size the data spread more uniformly over the upper left triangle \(\lbrace 0 < x < y < 1 \rbrace,\) as is to be expected for a sample from a distribution that is uniform on this triangle.

library(plot3D)

##  Draw a 3D histogram and a 2D heatmap of joint counts for a
##  sample of size n from the density f(x, y) = 2 * 1{0 < x < y < 1}:
joint_hist <- function(n, bins = 12, seed = 2005) {
  set.seed(seed)
  Z <- runif(n)
  X <- 1 - sqrt(1 - Z)       # X_i = F_X^{-1}(Z_i)
  U <- runif(n)
  Y <- (1 - X) * U + X       # Y_i = F_{Y|X}^{-1}(U_i), given X_i
  x_c <- cut(X, bins)
  y_c <- cut(Y, bins)

  ##  Calculate joint counts at cut levels:
  z <- table(x_c, y_c)

  ##  Plot as a 3D histogram:
  hist3D(z = z, border = "black", main = paste0("n=", n))

  ##  Plot as a 2D heatmap:
  image2D(z = z, border = "black", main = paste0("n=", n))
}

joint_hist(400)
joint_hist(1000)
joint_hist(10000)