5 Data Analysis and ML

Data analysis

In this lecture, Dr. Buscher tries to summarise half of a Part IA compsci course in one hour.


  • least-squares fitting maximises the likelihood of the data given the model (assuming Gaussian errors)
  • Bayes: $P(model | data) P(data) = P(data | model) P(model)$
  • $P(data)$ is fixed; if we also assume the “prior” $P(model)$ is uniform, then least squares (minimising $\chi^2$) also maximises the posterior probability $P(model | data)$
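
To make the link concrete: for independent Gaussian errors, $P(data | model) \propto \exp(-\chi^2/2)$, so the smallest $\chi^2$ gives the largest likelihood. The sketch below checks this numerically. The straight-line model, the noise level `sigma` and the synthetic data are assumptions made up for illustration, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a straight line with known Gaussian noise sigma
x = np.linspace(0.0, 10.0, 50)
sigma = 0.5
y = 2.0 * x + 1.0 + rng.normal(0.0, sigma, size=x.size)

def chi2(theta):
    """Sum of squared, noise-normalised residuals for y = a*x + b."""
    a, b = theta
    return np.sum(((y - (a * x + b)) / sigma) ** 2)

def log_likelihood(theta):
    """Gaussian log-likelihood: -chi^2/2 plus a constant independent of theta."""
    return -0.5 * chi2(theta) - x.size * np.log(sigma * np.sqrt(2.0 * np.pi))

# Least-squares fit using the design matrix [x, 1]
A = np.column_stack([x, np.ones_like(x)])
theta_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# Any other parameter choice has a larger chi^2 and so a lower likelihood
theta_other = theta_hat + np.array([0.05, -0.1])
print(chi2(theta_hat), chi2(theta_other))
print(log_likelihood(theta_hat), log_likelihood(theta_other))
```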

Moore-Penrose inverse

  • a robust algorithm that generates a pseudoinverse even if there is no unique solution
  • Singular value decomposition (SVD)
    import numpy as np
    # A is the design matrix and y the data vector (both assumed to be defined already)
    # compute the pseudoinverse of matrix A
    Ainv = np.linalg.pinv(A)
    # matrix multiplication uses the '@' operator
    theta = Ainv @ y
    # singular value decomposition of A
    u, s, vh = np.linalg.svd(A)
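
A minimal sketch of how the pseudoinverse follows from the SVD: invert the non-zero singular values and set the rest to zero. The matrix `A`, the vector `y` and the tolerance `tol` are invented for illustration; `A` is deliberately rank-deficient, so there is no unique least-squares solution.

```python
import numpy as np

# Hypothetical rank-deficient design matrix: third column = first + second
A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])
y = np.array([1.0, 2.0, 3.0, 4.0])

# Reduced SVD: A = u @ diag(s) @ vh
u, s, vh = np.linalg.svd(A, full_matrices=False)

# Invert only singular values above a tolerance; set the others to zero
tol = 1e-10 * s.max()
s_inv = np.zeros_like(s)
s_inv[s > tol] = 1.0 / s[s > tol]

# Pseudoinverse assembled from the SVD factors
Ainv_svd = vh.T @ np.diag(s_inv) @ u.T

# Minimum-norm least-squares solution
theta = Ainv_svd @ y
print(theta)
```

Dropping the tiny singular values is what makes the method robust: dividing by them would amplify noise without limit.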


Local optimum: Gradient descent
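
A minimal sketch of why gradient descent only finds a local optimum. The quartic `f`, the step size and the starting points are assumptions chosen for illustration: the function has two minima, and the one reached depends on where the search starts.

```python
def f(x):
    """Hypothetical 1D function with two minima."""
    return x**4 - 3.0 * x**2 + x

def grad_f(x):
    """Analytic derivative of f."""
    return 4.0 * x**3 - 6.0 * x + 1.0

def gradient_descent(x0, step=0.01, n_iter=2000):
    """Repeatedly step downhill along the negative gradient."""
    x = x0
    for _ in range(n_iter):
        x = x - step * grad_f(x)
    return x

x_left = gradient_descent(-2.0)   # ends in the deeper minimum at negative x
x_right = gradient_descent(2.0)   # ends in the shallower minimum at positive x
print(x_left, f(x_left))
print(x_right, f(x_right))
```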

Levenberg-Marquardt (for least squares)
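
A short sketch using SciPy's `scipy.optimize.least_squares` with `method="lm"`, which selects its Levenberg-Marquardt implementation. The exponential-decay model, the noise-free synthetic data and the starting guess are assumptions for illustration only.

```python
import numpy as np
from scipy.optimize import least_squares

# Hypothetical noise-free data from y = 2.5 * exp(-1.3 * t)
t = np.linspace(0.0, 5.0, 30)
y = 2.5 * np.exp(-1.3 * t)

def residuals(theta):
    """Model minus data; least_squares minimises the sum of squares of this."""
    amplitude, rate = theta
    return amplitude * np.exp(-rate * t) - y

# method="lm" needs at least as many residuals as parameters
result = least_squares(residuals, x0=[1.0, 1.0], method="lm")
print(result.x)
```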

Conjugate gradient
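
A minimal sketch using `scipy.optimize.minimize` with `method="CG"` (a nonlinear conjugate gradient method). The quadratic objective, with matrix `Q` and vector `b`, is made up for illustration; its exact minimum solves `Q @ x = b`, which gives something to compare against.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical quadratic: f(x) = 0.5 * x^T Q x - b^T x, with Q symmetric positive definite
Q = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

def objective(x):
    return 0.5 * x @ Q @ x - b @ x

def gradient(x):
    return Q @ x - b

result = minimize(objective, x0=np.zeros(2), jac=gradient, method="CG")

# For a quadratic the minimum is known exactly: it solves Q x = b
x_exact = np.linalg.solve(Q, b)
print(result.x, x_exact)
```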

help(scipy.optimize)  # needs `import scipy.optimize` first; hard to read on the command line or in Jupyter, so searching the online documentation is easier

Global maximum: Markov chain Monte Carlo (MCMC) method
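
A minimal Metropolis sampler, sketched under assumptions: the two-peaked `log_posterior`, the Gaussian proposal width and the chain length are all invented for illustration. The chain starts in the smaller peak, yet because it sometimes accepts downhill moves it can cross the valley and spend most of its time around the larger peak.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_posterior(x):
    """Hypothetical unnormalised log-posterior: small peak at -2, large peak at +2."""
    return np.logaddexp(np.log(0.3) - 0.5 * (x + 2.0) ** 2,
                        np.log(0.7) - 0.5 * (x - 2.0) ** 2)

n_steps = 20000
samples = np.empty(n_steps)
x = -2.0                      # start in the smaller (local) peak
logp = log_posterior(x)
for i in range(n_steps):
    proposal = x + rng.normal(0.0, 1.0)        # symmetric random-walk proposal
    logp_new = log_posterior(proposal)
    # Metropolis rule: accept with probability min(1, p_new / p_old)
    if np.log(rng.uniform()) < logp_new - logp:
        x, logp = proposal, logp_new
    samples[i] = x

kept = samples[2000:]                           # discard burn-in
best = kept[np.argmax(log_posterior(kept))]     # highest-posterior sample seen
fraction_right = np.mean(kept > 0.0)            # share of time in the larger peak
print(best, fraction_right)
```

The samples approximate the whole posterior, not just its maximum, so the same chain also gives uncertainties on the parameters.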


Universal approximation theorem: a neural network with one hidden layer can approximate any continuous function on a compact (closed, bounded) domain to arbitrary accuracy, given enough hidden units
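
A rough illustration, not a proof and not how networks are normally trained: the hidden layer below has random, fixed `tanh` units and only the output weights are fitted, by least squares. The target function, the weight distributions and the layer sizes are assumptions for illustration. The theorem only says good weights exist; it does not say how to find them or how many units are needed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target: sin(x) on [-pi, pi]
x = np.linspace(-np.pi, np.pi, 200)
target = np.sin(x)

# Random hidden-layer weights and biases (fixed, never trained)
weights = rng.normal(0.0, 2.0, size=50)
biases = rng.uniform(-3.0, 3.0, size=50)

def fit_rms_error(n_hidden):
    """Fit output weights for the first n_hidden units; return the RMS error on the grid."""
    hidden = np.tanh(np.outer(x, weights[:n_hidden]) + biases[:n_hidden])
    design = np.column_stack([hidden, np.ones_like(x)])   # extra column = output bias
    out_weights, *_ = np.linalg.lstsq(design, target, rcond=None)
    return np.sqrt(np.mean((design @ out_weights - target) ** 2))

rms_narrow = fit_rms_error(2)    # two hidden units
rms_wide = fit_rms_error(50)     # fifty hidden units
print(rms_narrow, rms_wide)
```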