Data analysis
In this lecture, Dr. Buscher tries to summarise half of a Part IA compsci course in one hour.
$\chi^2$ (chi-squared)
- minimising $\chi^2$ maximises the likelihood of the data given the model
- Bayes' theorem: $P(model | data) P(data) = P(data | model) P(model)$
- $P(data)$ is fixed, and if we assume the prior $P(model)$ is uniform, then least squares (minimising $\chi^2$) also maximises the posterior probability $P(model|data)$
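A minimal sketch of this idea, assuming Gaussian errors and a straight-line model (the data values below are made up for illustration):

```python
import numpy as np
from scipy.optimize import minimize

# chi-squared for a straight-line model y = m*x + c with Gaussian errors sigma
def chi2(theta, x, y, sigma):
    m, c = theta
    return np.sum(((y - (m * x + c)) / sigma) ** 2)

# toy data (values are illustrative only)
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.1, 1.9, 4.2, 5.8])
sigma = np.full(x.size, 0.2)

result = minimize(chi2, x0=[1.0, 0.0], args=(x, y, sigma))
print(result.x)  # best-fit (m, c) = maximum-likelihood estimate
```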
Moore-Penrose inverse
- a robust way to compute a pseudoinverse even when there is no unique solution (it returns the minimum-norm least-squares solution)
- computed via singular value decomposition (SVD)
```python
import numpy as np

# compute the pseudoinverse of matrix A
Ainv = np.linalg.pinv(A)
# matrix multiplication with the '@' operator
theta = Ainv @ y
# singular value decomposition
u, s, vh = np.linalg.svd(A)
```
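As a concrete sketch (the data and quadratic model are made up for illustration), fitting by building a design matrix and applying the pseudoinverse:

```python
import numpy as np

# hypothetical example: least-squares fit of a quadratic y = a + b*x + c*x^2
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 11)
y = 1.0 + 2.0 * x + 3.0 * x**2 + rng.normal(0.0, 0.1, x.size)

# design matrix: one column per basis function (1, x, x^2)
A = np.column_stack([np.ones_like(x), x, x**2])

theta = np.linalg.pinv(A) @ y  # minimum-norm least-squares solution
print(theta)                   # approximately (1, 2, 3)
```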
Pandas
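A minimal sketch of typical pandas usage for loading and inspecting tabular data (the file and column names are hypothetical):

```python
import pandas as pd

# load a table of measurements (file name "data.csv" is hypothetical)
df = pd.read_csv("data.csv")

print(df.head())        # first few rows
print(df.describe())    # summary statistics for each numeric column
x = df["x"].to_numpy()  # extract a column (name hypothetical) as a NumPy array
```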
Local optimum:
- Gradient descent
- Levenberg-Marquardt (for least squares)
- Conjugate gradient
help(scipy.optimize)  # not the easiest thing to read in the command line or Jupyter; better to Google
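A minimal sketch of a Levenberg-Marquardt fit using scipy.optimize.least_squares (the straight-line model and data are made up for illustration):

```python
import numpy as np
from scipy.optimize import least_squares

# weighted residuals of a straight-line model y = a*x + b
def residuals(theta, x, y, sigma):
    a, b = theta
    return (y - (a * x + b)) / sigma

# toy data (illustrative only)
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 20)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, x.size)
sigma = np.full(x.size, 0.5)

# method="lm" selects the Levenberg-Marquardt algorithm
fit = least_squares(residuals, x0=[1.0, 0.0], args=(x, y, sigma), method="lm")
print(fit.x)  # best-fit (a, b)
```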
Global maximum: Markov chain Monte Carlo (MCMC)
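A minimal Metropolis sampler sketch (the Gaussian log-posterior here is a hypothetical stand-in for a real one):

```python
import numpy as np

# hypothetical log-posterior: Gaussian centred at theta = 3
def log_post(theta):
    return -0.5 * (theta - 3.0) ** 2

rng = np.random.default_rng(0)
theta, samples = 0.0, []
for _ in range(10_000):
    proposal = theta + rng.normal(0.0, 0.5)  # random-walk proposal
    # accept with probability min(1, post(proposal) / post(theta))
    if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
        theta = proposal
    samples.append(theta)

print(np.mean(samples))  # estimate of the posterior mean (about 3)
```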
Machine learning
Universal approximation theorem: a neural network with a single hidden layer (given enough hidden units) can approximate any continuous function on a compact domain to arbitrary accuracy
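A minimal sketch of the forward pass of such a one-hidden-layer network (weights are random and untrained; layer sizes chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 1, 50, 1

# random (untrained) weights and biases
W1 = rng.normal(size=(n_hidden, n_in))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(n_out, n_hidden))
b2 = rng.normal(size=n_out)

def net(x):
    # hidden layer with tanh nonlinearity, then a linear output layer
    return W2 @ np.tanh(W1 @ x + b1) + b2

print(net(np.array([0.3])))
```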