# PCA for interviews: Eigenvectors/Lagrangian

PCA: the math behind constrained optimisation, drawbacks, and other important discussions

One of the most important mathematical concepts required to understand how PCA works is that of eigenvectors and eigenvalues. If you know the algorithm, you will be aware that, to reduce a dataset from dimension d to a smaller dimension k, PCA calculates the eigenvectors and eigenvalues of the covariance matrix of the dataset and keeps the k eigenvectors whose eigenvalues are largest, taken in descending order.
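The procedure just described can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `pca_project` and the toy data are assumptions made for the example, and the data is centred before the covariance matrix is computed.

```python
import numpy as np

def pca_project(X, k):
    """Project the rows of X (n samples x d features) onto the top-k principal axes."""
    X_centered = X - X.mean(axis=0)          # centre each feature
    S = np.cov(X_centered, rowvar=False)     # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)     # eigh is meant for symmetric matrices
    order = np.argsort(eigvals)[::-1][:k]    # indices of the k largest eigenvalues
    components = eigvecs[:, order]           # d x k, one eigenvector per column
    return X_centered @ components, eigvals[order]

# Hypothetical toy data: 200 samples, 3 features with very different spreads
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3)) * np.array([5.0, 1.0, 0.2])

Z, top_eigvals = pca_project(X, k=2)
print(Z.shape)       # (200, 2)
print(top_eigvals)   # variance along each retained axis, largest first
```

Each retained eigenvalue equals the variance of the data along its eigenvector, which is why sorting by eigenvalue amounts to sorting by captured variance.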

But why? Why do eigenvalues and eigenvectors get involved when reducing dimensions?

## What are Eigenvectors/eigenvalues?

For any square matrix A, its eigenvectors and eigenvalues satisfy the following relation:

$$A v = \lambda v$$

where v is an eigenvector of A and lambda is the corresponding eigenvalue.

Geometrically speaking, for a given eigenvector-eigenvalue pair, multiplying the eigenvector by the matrix merely scales it by a factor lambda. The eigenvector therefore stays on the same line: its direction is unchanged (or exactly reversed when lambda is negative).

Eigenvalues and eigenvectors are defined only for square matrices. Eigenvectors are by definition nonzero. Eigenvalues may be equal to zero.
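A quick numerical check of the relation A v = lambda v may make this concrete. The 2 x 2 matrix below is an arbitrary example chosen for illustration, not something taken from a dataset.

```python
import numpy as np

# Arbitrary symmetric 2 x 2 matrix used only as an illustration
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eig(A)   # columns of eigvecs are the eigenvectors

for lam, v in zip(eigvals, eigvecs.T):
    # A @ v and lam * v should agree up to floating-point error
    print(lam, np.allclose(A @ v, lam * v))
```

For this matrix the eigenvalues are 3 and 1, and for each pair the product A v is the same vector as lambda times v.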

Now let's look at another concept used in PCA.

## Constrained Optimisation

Sometimes you need to find the maximum or minimum of an expression that is itself subject to a constraint. In that case you cannot simply set its first derivative to zero; you must also account for the constraint.

Such constrained optimisation problems are solved using the concept of LAGRANGE MULTIPLIERS.

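Note: for an equality constraint, a Lagrange multiplier can be positive, negative, or zero. A sign restriction (non-negativity) applies only to the multipliers of inequality constraints. In PCA the multiplier turns out to be an eigenvalue of the covariance matrix, which is never negative.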
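A small worked example may help here. The function and constraint below are chosen purely for illustration and are not part of PCA: find the extrema of f(x, y) = x + y subject to x^2 + y^2 = 1. Fold the constraint into a single expression, the Lagrangian, with a multiplier lambda:

$$L(x, y, \lambda) = x + y - \lambda\,(x^2 + y^2 - 1)$$

Setting the partial derivatives with respect to x and y to zero gives

$$1 - 2\lambda x = 0, \qquad 1 - 2\lambda y = 0 \quad\Longrightarrow\quad x = y = \frac{1}{2\lambda}$$

and substituting into the constraint gives

$$\frac{2}{4\lambda^2} = 1 \quad\Longrightarrow\quad \lambda = \pm\frac{1}{\sqrt{2}}$$

The root lambda = 1/sqrt(2) gives x = y = 1/sqrt(2), the maximum, where f = sqrt(2). The root lambda = -1/sqrt(2) gives x = y = -1/sqrt(2), the minimum, where f = -sqrt(2).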

## The optimisation problem in PCA

Now let's see how the two concepts, eigenvectors/eigenvalues and Lagrange multipliers, are used to solve the PCA optimisation problem. Recall that PCA is all about finding the axes with maximum variance. Below is the optimisation equation of PCA, where S is the covariance matrix of the dataset, hence a d × d square matrix (where d is the original dimension of the dataset), and u is the direction along which the variance has to be maximised.
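In symbols, one standard way to write this problem is the following sketch. The unit-length constraint on u is an assumption made explicit here: without it, the variance could be made arbitrarily large just by scaling u.

$$\max_{u} \; u^{T} S u \quad \text{subject to} \quad u^{T} u = 1$$

The quantity u^T S u is the variance of the (centred) data after projecting it onto the direction u, so maximising it picks out the axis of greatest spread.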