Definition:

  • Normal distribution
  • A $p$-dimensional normal density for the random vector $\mathbf{X} = [X_1, X_2, \ldots, X_p]'$ has the form $f(\mathbf{x}) = \dfrac{1}{(2\pi)^{p/2}|\boldsymbol{\Sigma}|^{1/2}}\, e^{-(\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})/2}$, denoted $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$
    • as $\boldsymbol{\mu}$ moves, the maximum point of the density moves accordingly
    • therefore, $\boldsymbol{\Sigma}$ is a symmetric matrix with $\sigma_{ik} = \sigma_{ki}$
    • $\rho_{ik} = \sigma_{ik}/\sqrt{\sigma_{ii}\,\sigma_{kk}}$ by definition of correlation
  • Contours of constant density for the $p$-dimensional normal distribution are ellipsoids defined by the $\mathbf{x}$ such that $(\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu}) = c^2$
    • These ellipsoids are centered at $\boldsymbol{\mu}$ and have axes $\pm c\sqrt{\lambda_i}\,\mathbf{e}_i$
    • where $(\lambda_i, \mathbf{e}_i)$ are the eigenvalue-eigenvector pairs of $\boldsymbol{\Sigma}$
    • In the bivariate case ($p = 2$), each constant-density contour is an ellipse in the plane spanned by $\mathbf{e}_1$ and $\mathbf{e}_2$, one ellipse for each value of the constant $c$ (see the sketch after this list)
  • The solid ellipsoid of $\mathbf{x}$ values satisfying $(\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu}) \le \chi^2_p(\alpha)$ has probability $1-\alpha$
    • The $p$-variate normal density has its maximum value when the squared distance $(\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu})$ is zero, that is, when $\mathbf{x} = \boldsymbol{\mu}$. Thus, $\boldsymbol{\mu}$ is the point of maximum density, or mode, as well as the expected value of $\mathbf{X}$, or mean.
    • The fact that $\boldsymbol{\mu}$ is the mean of the multivariate normal distribution follows from the symmetry exhibited by the constant-density contours: these contours are centered, or balanced, at $\boldsymbol{\mu}$.
  • The following are true for a random vector $\mathbf{X}$ having a multivariate normal distribution:
    1. Linear combinations of the components of $\mathbf{X}$ are normally distributed.
    2. All subsets of the components of $\mathbf{X}$ have a (multivariate) normal distribution.
    3. Zero covariance implies that the corresponding components are independently distributed.
    4. The conditional distributions of the components are (multivariate) normal.
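
A minimal sketch of the density formula and the contour geometry above, assuming numpy and scipy are available; the values of $\boldsymbol{\mu}$, $\boldsymbol{\Sigma}$, $\mathbf{x}$, and the 50% contour level are made up for illustration:

```python
import numpy as np
from scipy.stats import multivariate_normal, chi2

# Illustrative bivariate parameters (not from any real data set)
mu = np.array([1.0, 2.0])
Sigma = np.array([[2.0, 0.8],
                  [0.8, 1.0]])
p = len(mu)

# Density at a point x: (2*pi)^(-p/2) |Sigma|^(-1/2) exp(-d2/2)
x = np.array([1.5, 1.5])
diff = x - mu
d2 = diff @ np.linalg.solve(Sigma, diff)              # (x-mu)' Sigma^{-1} (x-mu)
dens = np.exp(-d2 / 2) / np.sqrt((2 * np.pi) ** p * np.linalg.det(Sigma))
assert np.isclose(dens, multivariate_normal(mu, Sigma).pdf(x))

# Axes of the constant-density ellipsoid that contains 50% probability:
# half-axes are c*sqrt(lambda_i) along e_i, with c^2 the chi^2_p 0.5 quantile
lam, E = np.linalg.eigh(Sigma)                        # eigenpairs of Sigma
c = np.sqrt(chi2.ppf(0.5, df=p))
for lam_i, e_i in zip(lam, E.T):
    print("half-axis length:", c * np.sqrt(lam_i), "direction:", e_i)
```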

Proposition:

  • If $\boldsymbol{\Sigma}$ is positive definite, so that $\boldsymbol{\Sigma}^{-1}$ exists, then $\boldsymbol{\Sigma}\mathbf{e} = \lambda\mathbf{e}$ implies $\boldsymbol{\Sigma}^{-1}\mathbf{e} = (1/\lambda)\mathbf{e}$, so $(1/\lambda, \mathbf{e})$ is an eigenvalue-eigenvector pair for $\boldsymbol{\Sigma}^{-1}$ corresponding to the pair $(\lambda, \mathbf{e})$ for $\boldsymbol{\Sigma}$
    • also, $\boldsymbol{\Sigma}^{-1}$ is positive definite
  • Linear combination of the components of $\mathbf{X}$: If $\mathbf{X}$ is distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, then any linear combination of its variables, $\mathbf{a}'\mathbf{X} = a_1X_1 + a_2X_2 + \cdots + a_pX_p$, is distributed as $N(\mathbf{a}'\boldsymbol{\mu}, \mathbf{a}'\boldsymbol{\Sigma}\mathbf{a})$.
    • Also, if $\mathbf{a}'\mathbf{X}$ is distributed as $N(\mathbf{a}'\boldsymbol{\mu}, \mathbf{a}'\boldsymbol{\Sigma}\mathbf{a})$ for every $\mathbf{a}$, then $\mathbf{X}$ must be $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$
  • Matrix transformation variable: If $\mathbf{X}$ is distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$, the $q$ linear combinations $\mathbf{A}\mathbf{X}$, where $\mathbf{A}$ is a $q \times p$ matrix (each row of $\mathbf{A}$ represents one combination), are distributed as $N_q(\mathbf{A}\boldsymbol{\mu}, \mathbf{A}\boldsymbol{\Sigma}\mathbf{A}')$.
      • This produces $q$ new variables that together form a vector
    • Also, $\mathbf{X} + \mathbf{d}$, where $\mathbf{d}$ is a $p \times 1$ vector of constants, is distributed as $N_p(\boldsymbol{\mu} + \mathbf{d}, \boldsymbol{\Sigma})$
  • All subsets of $\mathbf{X}$ are normally distributed. If we respectively partition $\mathbf{X}$ into $q$ and $p-q$ elements, and likewise its mean vector $\boldsymbol{\mu}$ and its covariance matrix $\boldsymbol{\Sigma}$, as
    • $\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}$, $\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}$, and $\boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix}$
    • then $\mathbf{X}_1$ is distributed as $N_q(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_{11})$
  • Dependence between partitioned components of $\mathbf{X}$:
    • If $\mathbf{X}_1$ ($q_1 \times 1$) and $\mathbf{X}_2$ ($q_2 \times 1$) are independent, then $\mathrm{Cov}(\mathbf{X}_1, \mathbf{X}_2) = \mathbf{0}$, a $q_1 \times q_2$ matrix of zeros
    • If the partition $\begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}$ is $N_{q_1+q_2}\!\left( \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}, \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix} \right)$, then $\mathbf{X}_1$ and $\mathbf{X}_2$ are independent if and only if $\boldsymbol{\Sigma}_{12} = \mathbf{0}$
    • If $\mathbf{X}_1$ and $\mathbf{X}_2$ are independent and are distributed as $N_{q_1}(\boldsymbol{\mu}_1, \boldsymbol{\Sigma}_{11})$ and $N_{q_2}(\boldsymbol{\mu}_2, \boldsymbol{\Sigma}_{22})$ respectively, then $\begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}$ has the multivariate normal distribution $N_{q_1+q_2}\!\left( \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}, \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_{22} \end{bmatrix} \right)$
  • Conditional expectation: Let $\mathbf{X} = \begin{bmatrix} \mathbf{X}_1 \\ \mathbf{X}_2 \end{bmatrix}$ be distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with $\boldsymbol{\mu} = \begin{bmatrix} \boldsymbol{\mu}_1 \\ \boldsymbol{\mu}_2 \end{bmatrix}$, $\boldsymbol{\Sigma} = \begin{bmatrix} \boldsymbol{\Sigma}_{11} & \boldsymbol{\Sigma}_{12} \\ \boldsymbol{\Sigma}_{21} & \boldsymbol{\Sigma}_{22} \end{bmatrix}$, and $|\boldsymbol{\Sigma}_{22}| > 0$. Then the conditional distribution of $\mathbf{X}_1$, given that $\mathbf{X}_2 = \mathbf{x}_2$, is normal and has:
    • mean $= \boldsymbol{\mu}_1 + \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}(\mathbf{x}_2 - \boldsymbol{\mu}_2)$
    • covariance $= \boldsymbol{\Sigma}_{11} - \boldsymbol{\Sigma}_{12}\boldsymbol{\Sigma}_{22}^{-1}\boldsymbol{\Sigma}_{21}$
    • note that the covariance does not depend on the value $\mathbf{x}_2$ of the conditioning variable (see the conditional-distribution sketch after this list)
  • Let $\mathbf{X}$ be distributed as $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ with $|\boldsymbol{\Sigma}| > 0$. Then:
    • $(\mathbf{X}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{X}-\boldsymbol{\mu})$ is distributed as $\chi^2_p$, a chi-squared distribution with $p$ df
    • The $N_p(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ distribution assigns probability $1-\alpha$ to the solid ellipsoid $\{\mathbf{x} : (\mathbf{x}-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}-\boldsymbol{\mu}) \le \chi^2_p(\alpha)\}$, where $\chi^2_p(\alpha)$ denotes the upper $(100\alpha)$th percentile of the $\chi^2_p$ distribution
  • Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ be mutually independent with $\mathbf{X}_j$ distributed as $N_p(\boldsymbol{\mu}_j, \boldsymbol{\Sigma})$ (each has the same covariance matrix $\boldsymbol{\Sigma}$); then $\mathbf{V}_1 = c_1\mathbf{X}_1 + c_2\mathbf{X}_2 + \cdots + c_n\mathbf{X}_n$ is distributed as $N_p\!\left( \sum_{j=1}^n c_j\boldsymbol{\mu}_j,\ \left( \sum_{j=1}^n c_j^2 \right)\boldsymbol{\Sigma} \right)$
    • Moreover, $\mathbf{V}_1$ and $\mathbf{V}_2 = b_1\mathbf{X}_1 + b_2\mathbf{X}_2 + \cdots + b_n\mathbf{X}_n$ are jointly multivariate normal with covariance matrix $\begin{bmatrix} \left( \sum c_j^2 \right)\boldsymbol{\Sigma} & (\mathbf{b}'\mathbf{c})\boldsymbol{\Sigma} \\ (\mathbf{b}'\mathbf{c})\boldsymbol{\Sigma} & \left( \sum b_j^2 \right)\boldsymbol{\Sigma} \end{bmatrix}$
    • consequently, $\mathbf{V}_1$ and $\mathbf{V}_2$ are independent if $\mathbf{b}'\mathbf{c} = \sum_{j=1}^n c_j b_j = 0$
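
A minimal numpy sketch of the conditional mean and covariance formulas from the conditional-expectation result above; the partition sizes and all numeric values are illustrative:

```python
import numpy as np

# Partition X = [X1, X2] with q1 = 1, q2 = 2 (sizes chosen arbitrarily)
mu1 = np.array([0.0])
mu2 = np.array([1.0, -1.0])
S11 = np.array([[2.0]])
S12 = np.array([[0.5, 0.3]])
S22 = np.array([[1.0, 0.2],
                [0.2, 1.5]])

x2 = np.array([1.5, 0.0])                           # observed value of X2
# mean = mu1 + Sigma12 Sigma22^{-1} (x2 - mu2)
cond_mean = mu1 + S12 @ np.linalg.solve(S22, x2 - mu2)
# covariance = Sigma11 - Sigma12 Sigma22^{-1} Sigma21 (does not involve x2)
cond_cov = S11 - S12 @ np.linalg.solve(S22, S12.T)
print("conditional mean:", cond_mean)
print("conditional covariance:", cond_cov)
```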

Multivariate Normal Likelihood:

  • Assume that the vectors $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ represent a random sample from a multivariate normal population with mean vector $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$
  • The joint density function of all the observations is the product of the marginal normal densities: $\prod_{j=1}^n \left\{ \frac{1}{(2\pi)^{p/2}|\boldsymbol{\Sigma}|^{1/2}}\, e^{-(\mathbf{x}_j-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}_j-\boldsymbol{\mu})/2} \right\} = \frac{1}{(2\pi)^{np/2}|\boldsymbol{\Sigma}|^{n/2}}\, e^{-\sum_{j=1}^n (\mathbf{x}_j-\boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x}_j-\boldsymbol{\mu})/2}$
  • This expression, viewed as a function of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ for the fixed set of observations $\mathbf{x}_1, \ldots, \mathbf{x}_n$, is called the likelihood
  • Lemma: Let $\mathbf{A}$ be a $k \times k$ symmetric matrix and $\mathbf{x}$ be a $k \times 1$ vector. Then:
    • $\mathbf{x}'\mathbf{A}\mathbf{x} = \mathrm{tr}(\mathbf{x}'\mathbf{A}\mathbf{x}) = \mathrm{tr}(\mathbf{A}\mathbf{x}\mathbf{x}')$, and $\mathrm{tr}(\mathbf{A}) = \sum_{i=1}^k \lambda_i$, where the $\lambda_i$ are the eigenvalues of $\mathbf{A}$
  • With that, we can write the joint density function (the likelihood) as $L(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = \frac{1}{(2\pi)^{np/2}|\boldsymbol{\Sigma}|^{n/2}}\, e^{-\mathrm{tr}\left[ \boldsymbol{\Sigma}^{-1}\left( \sum_{j=1}^n (\mathbf{x}_j-\bar{\mathbf{x}})(\mathbf{x}_j-\bar{\mathbf{x}})' + n(\bar{\mathbf{x}}-\boldsymbol{\mu})(\bar{\mathbf{x}}-\boldsymbol{\mu})' \right) \right]/2}$ (see the sketch below)
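
A sketch of the likelihood written in the trace form above, assuming numpy and scipy; `mvn_log_likelihood` is a hypothetical helper name and the check data are randomly generated:

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvn_log_likelihood(X, mu, Sigma):
    """Log joint density of the rows of X (an n x p sample) under
    N_p(mu, Sigma), using the trace form from the lemma above."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    # B = sum_j (x_j - xbar)(x_j - xbar)' + n (xbar - mu)(xbar - mu)'
    B = (X - xbar).T @ (X - xbar) + n * np.outer(xbar - mu, xbar - mu)
    sign, logdet = np.linalg.slogdet(Sigma)
    return (-n * p / 2 * np.log(2 * np.pi)
            - n / 2 * logdet
            - np.trace(np.linalg.solve(Sigma, B)) / 2)

# Agrees with summing the marginal normal log-densities directly:
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 3))
mu, Sigma = np.zeros(3), np.eye(3)
assert np.isclose(mvn_log_likelihood(X, mu, Sigma),
                  multivariate_normal(mu, Sigma).logpdf(X).sum())
```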

Maximum likelihood estimation of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$:

  • Lemma: Given a $p \times p$ symmetric positive definite matrix $\mathbf{B}$ and a scalar $b > 0$, it follows that $\frac{1}{|\boldsymbol{\Sigma}|^b}\, e^{-\mathrm{tr}(\boldsymbol{\Sigma}^{-1}\mathbf{B})/2} \le \frac{1}{|\mathbf{B}|^b}\,(2b)^{pb}\, e^{-bp}$ for all positive definite matrices $\boldsymbol{\Sigma}$, with equality holding only for $\boldsymbol{\Sigma} = \frac{1}{2b}\mathbf{B}$
  • Proposition: Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ be a random sample from a normal population with mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$.
    • Then $\hat{\boldsymbol{\mu}} = \bar{\mathbf{X}}$ and $\hat{\boldsymbol{\Sigma}} = \frac{1}{n}\sum_{j=1}^n (\mathbf{X}_j - \bar{\mathbf{X}})(\mathbf{X}_j - \bar{\mathbf{X}})' = \frac{n-1}{n}\mathbf{S}$ are the maximum likelihood estimators of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ (see the sketch below)
    • Their observed values, $\bar{\mathbf{x}}$ and $\frac{1}{n}\sum_{j=1}^n (\mathbf{x}_j - \bar{\mathbf{x}})(\mathbf{x}_j - \bar{\mathbf{x}})'$, are the maximum likelihood estimates of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$
  • Maximum likelihood estimators possess an invariance property
    • Let $\hat{\boldsymbol{\theta}}$ be the maximum likelihood estimator of $\boldsymbol{\theta}$, and consider estimating the parameter $h(\boldsymbol{\theta})$, a function of $\boldsymbol{\theta}$. Then the maximum likelihood estimate of $h(\boldsymbol{\theta})$ is $h(\hat{\boldsymbol{\theta}})$
  • Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ be a random sample from a multivariate normal population with mean $\boldsymbol{\mu}$ and covariance $\boldsymbol{\Sigma}$; then $\bar{\mathbf{X}}$ and $\mathbf{S}$ are sufficient statistics
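
A short numpy sketch of the estimators from the proposition above; `mvn_mle` is a hypothetical helper name and the sample is randomly generated:

```python
import numpy as np

def mvn_mle(X):
    """MLEs for a N_p(mu, Sigma) sample X (n x p):
    mu_hat = xbar and Sigma_hat = ((n-1)/n) S."""
    n = X.shape[0]
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)          # sample covariance (divides by n-1)
    Sigma_hat = (n - 1) / n * S          # the MLE divides by n instead
    return xbar, Sigma_hat

# By the invariance property, the MLE of e.g. |Sigma| is |Sigma_hat|:
rng = np.random.default_rng(1)
xbar, Sigma_hat = mvn_mle(rng.standard_normal((40, 2)))
print(np.linalg.det(Sigma_hat))
```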

The sampling distribution of $\bar{\mathbf{X}}$ and $\mathbf{S}$:

  • Analogous to the univariate central limit theorem
  • For the multivariate case, $\bar{\mathbf{X}}$ has a normal distribution with mean $\boldsymbol{\mu}$ and covariance matrix $\frac{1}{n}\boldsymbol{\Sigma}$
  • The sampling distribution of the sample covariance matrix is called the Wishart distribution, defined as the sum of independent products of multivariate normal random vectors
  • Wishart distribution with $m$ df = distribution of $\sum_{j=1}^m \mathbf{Z}_j\mathbf{Z}_j'$, where the $\mathbf{Z}_j$ are each independently distributed as $N_p(\mathbf{0}, \boldsymbol{\Sigma})$
    • denote it with $W_m(\cdot \mid \boldsymbol{\Sigma})$
  • Theorem: Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ be a random sample of size $n$ from a $p$-variate normal distribution with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$, then:
    • $\bar{\mathbf{X}}$ is distributed as $N_p(\boldsymbol{\mu}, \frac{1}{n}\boldsymbol{\Sigma})$, $(n-1)\mathbf{S}$ is distributed as a Wishart random matrix $W_{n-1}(\cdot \mid \boldsymbol{\Sigma})$, and $\bar{\mathbf{X}}$ and $\mathbf{S}$ are independent
  • Properties of Wishart distribution:
    • If $\mathbf{A}_1$ is distributed as $W_{m_1}(\mathbf{A}_1 \mid \boldsymbol{\Sigma})$ independently of $\mathbf{A}_2$, which is distributed as $W_{m_2}(\mathbf{A}_2 \mid \boldsymbol{\Sigma})$, then $\mathbf{A}_1 + \mathbf{A}_2$ is distributed as $W_{m_1+m_2}(\mathbf{A}_1 + \mathbf{A}_2 \mid \boldsymbol{\Sigma})$; that is, the degrees of freedom add
    • If $\mathbf{A}$ is distributed as $W_m(\mathbf{A} \mid \boldsymbol{\Sigma})$, then $\mathbf{C}\mathbf{A}\mathbf{C}'$ is distributed as $W_m(\mathbf{C}\mathbf{A}\mathbf{C}' \mid \mathbf{C}\boldsymbol{\Sigma}\mathbf{C}')$ (see the simulation sketch below)
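
A small simulation sketch of the Wishart definition above, assuming numpy; the dimensions and $\boldsymbol{\Sigma}$ are arbitrary illustration values:

```python
import numpy as np

rng = np.random.default_rng(2)
p, m = 3, 10
Sigma = np.array([[1.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 1.0]])

# One W_m(. | Sigma) draw: sum of m independent outer products Z_j Z_j'
Z = rng.multivariate_normal(np.zeros(p), Sigma, size=m)   # rows are Z_j'
A = Z.T @ Z                                               # = sum_j Z_j Z_j'
# For a normal sample of size n, (n-1)S is one such draw with m = n - 1.
print(A)
```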

Large-sample behaviour of $\bar{\mathbf{X}}$ and $\mathbf{S}$:

  • Let $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_n$ be independent observations from any population with mean $\boldsymbol{\mu}$ and finite covariance $\boldsymbol{\Sigma}$.
  • Then $\sqrt{n}(\bar{\mathbf{X}} - \boldsymbol{\mu})$ has an approximate $N_p(\mathbf{0}, \boldsymbol{\Sigma})$ distribution for large sample sizes.
    • Hence $n$ should also be large relative to $p$.
  • In addition, $n(\bar{\mathbf{X}} - \boldsymbol{\mu})'\mathbf{S}^{-1}(\bar{\mathbf{X}} - \boldsymbol{\mu})$ is approximately $\chi^2_p$ when $n$ is large relative to $p$ (see the simulation sketch below)
    • $p$ is the dimension of the space, i.e. the number of variables
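
A simulation sketch of the chi-square approximation above, using a deliberately non-normal (exponential) population; numpy and scipy are assumed, and all sizes are arbitrary:

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(3)
p, n, reps = 2, 500, 2000
mu = np.ones(p)                          # true mean of Exp(1) marginals
stats = []
for _ in range(reps):
    X = rng.exponential(1.0, size=(n, p))
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    diff = xbar - mu
    stats.append(n * diff @ np.linalg.solve(S, diff))  # n (xbar-mu)' S^{-1} (xbar-mu)
# Should print roughly 0.95: the statistic behaves like chi^2_p
print(np.mean(np.array(stats) < chi2.ppf(0.95, df=p)))
```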

Evaluating multivariate normality:

  • Similar to testing for a univariate normal distribution, but using the sample mean vector $\bar{\mathbf{x}}$ and the sample covariance matrix $\mathbf{S}$
  • Use the squared generalized distances $d_j^2 = (\mathbf{x}_j - \bar{\mathbf{x}})'\mathbf{S}^{-1}(\mathbf{x}_j - \bar{\mathbf{x}})$, $j = 1, 2, \ldots, n$
    • this applies to all $n$ observation vectors, each of dimension $p$
  • When the parent population is multivariate normal and both $n$ and $n - p$ are greater than 30, each of the squared distances $d_1^2, d_2^2, \ldots, d_n^2$ should behave like a chi-square random variable.
  • Although these distances are not independent or exactly chi-square distributed, it is helpful to plot them as if they were. The resulting plot is called a chi-square plot.
  • Steps:
    • find $\bar{\mathbf{x}}$ by finding all sample means $\bar{x}_k = \frac{1}{n}\sum_{j=1}^n x_{jk}$
    • find $\mathbf{S}$ by finding all:
      • $s_{kk} = \frac{1}{n-1}\sum_{j=1}^n (x_{jk} - \bar{x}_k)^2$, remember: the sample variance
      • $s_{ik} = \frac{1}{n-1}\sum_{j=1}^n (x_{ji} - \bar{x}_i)(x_{jk} - \bar{x}_k)$, remember: the sample covariance
    • Construct the chi-square plot:
      • order the squared distances from smallest to largest: $d_{(1)}^2 \le d_{(2)}^2 \le \cdots \le d_{(n)}^2$
      • Graph the pairs $\left( q_{c,p}\!\left( \frac{j - 1/2}{n} \right),\ d_{(j)}^2 \right)$
        • where $q_{c,p}((j - 1/2)/n)$ is the $100(j - 1/2)/n$ quantile of the chi-square distribution with $p$ degrees of freedom
        • in Excel, use =CHISQ.INV.RT((n-j+0.5)/n,2) (the df here is 2 because the example is bivariate; use $p$ in general)
          • note that $j = 1, 2, \ldots, n$
    • Find the correlation coefficient of the points in the plot; it is our test statistic
    • Compare it against the critical value for the correlation test at the chosen significance level and sample size $n$; reject the hypothesis of multivariate normality if the correlation coefficient falls below that critical value (see the sketch after these steps)
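
A sketch of the whole procedure, assuming numpy and scipy; `chi_square_plot_points` is a hypothetical helper name, the sample is randomly generated, and the final comparison against a tabled critical value is left to the reader:

```python
import numpy as np
from scipy.stats import chi2

def chi_square_plot_points(X):
    """Coordinates of the chi-square plot for an n x p sample X:
    chi^2_p quantiles at levels (j - 0.5)/n vs. ordered distances d_(j)^2."""
    n, p = X.shape
    xbar = X.mean(axis=0)
    S = np.cov(X, rowvar=False)
    diffs = X - xbar
    d2 = np.einsum('ij,ij->i', diffs @ np.linalg.inv(S), diffs)  # all d_j^2
    q = chi2.ppf((np.arange(1, n + 1) - 0.5) / n, df=p)
    return q, np.sort(d2)

rng = np.random.default_rng(4)
X = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=100)
q, d2 = chi_square_plot_points(X)
r_Q = np.corrcoef(q, d2)[0, 1]    # test statistic; near 1 under normality
print(r_Q)                        # compare against the tabled critical value
```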