# Inspecting word2vec Matrices

A detailed look inside the weight matrices of the word2vec model. Both the CBOW and Skip-gram models are discussed.
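As a minimal sketch of the matrices involved (toy sizes and random values are assumptions for illustration): word2vec keeps an input/projection matrix whose rows are the word embeddings, and an output matrix used to score context words.

```python
import numpy as np

# Hypothetical toy sizes: vocabulary of 10 words, 4-dimensional embeddings.
V, d = 10, 4
rng = np.random.default_rng(0)

# word2vec learns two weight matrices:
#   W_in  (V x d): input/projection matrix -- row i is the embedding of word i
#   W_out (d x V): output matrix used to score words as context
W_in = rng.normal(size=(V, d))
W_out = rng.normal(size=(d, V))

def score_context(word_id):
    """Unnormalized score of every word as context of `word_id` (Skip-gram view)."""
    v = W_in[word_id]   # embedding lookup is just selecting a row of W_in
    return v @ W_out    # one score per vocabulary word

scores = score_context(3)
```

After training, it is the rows of `W_in` that are typically kept as the word vectors.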

# N-gram Language Models

The concept of a language model as a probability distribution over sequences of words is explored via probabilistic interpretations.
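A bigram model can be sketched in a few lines by estimating conditional probabilities from counts (the tiny corpus here is a made-up example; real models need far more text and smoothing):

```python
from collections import Counter

# A tiny illustrative corpus; a real model would be trained on much more text.
corpus = "the cat sat on the mat the cat ran".split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))

def bigram_prob(w_prev, w):
    """MLE estimate P(w | w_prev) = count(w_prev, w) / count(w_prev)."""
    return bigrams[(w_prev, w)] / unigrams[w_prev]

# "the" occurs 3 times and "the cat" twice, so P(cat | the) = 2/3.
p = bigram_prob("the", "cat")
```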

# Maximum Likelihood Estimation – Finding the Best Parametric Model

An intuitive explanation of what likelihood is and how maximizing it gives the best parametric model that fits the data. A few examples are also worked out in detail.
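A classic worked example is the Bernoulli (coin-flip) case: with 7 heads in 10 flips, the maximum-likelihood estimate is simply 7/10. The sketch below (data values are assumptions) checks this by scanning candidate parameters for the one with the highest log-likelihood:

```python
import math

# Observations: 7 heads out of 10 flips; the Bernoulli MLE is heads / n.
heads, n = 7, 10
p_hat = heads / n

def log_likelihood(p):
    """Log-likelihood of the data under Bernoulli parameter p."""
    return heads * math.log(p) + (n - heads) * math.log(1 - p)

# The MLE is the parameter that maximizes the (log-)likelihood;
# a grid search over candidates recovers p_hat = 0.7.
best = max((k / 100 for k in range(1, 100)), key=log_likelihood)
```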

# Gradient Descent for a Single Artificial Neuron

This article explores the derivation of the gradient descent algorithm for a single artificial neuron with a logistic activation function. A few properties of the logistic function are also discussed.
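The resulting update rule can be sketched as follows (the toy 1-D dataset, learning rate, and step count are assumptions for illustration). For the cross-entropy loss with a logistic neuron, the gradient simplifies to `(prediction - label) * input`, thanks to the identity `sigma'(z) = sigma(z) * (1 - sigma(z))`:

```python
import math

def sigmoid(z):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-z))

# Toy 1-D data: label is 1 when x > 0 (hypothetical example).
data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w, b, lr = 0.0, 0.0, 0.5

for _ in range(200):
    for x, y in data:
        p = sigmoid(w * x + b)
        # dL/dw = (p - y) * x and dL/db = (p - y): the sigma * (1 - sigma)
        # factor from the activation cancels against the loss derivative.
        w -= lr * (p - y) * x
        b -= lr * (p - y)
```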

# Primer on Lambda Calculus

The basic concepts of Lambda Calculus are discussed, including functions, applications, the scope of free and bound variables, and the order of evaluation. A few examples are worked out.
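One way to make these ideas concrete is Church encoding, where data is represented purely as functions. A small sketch using Python lambdas (the encoding itself is standard; the helper names are mine):

```python
# Church booleans: a boolean is a function that selects one of two arguments.
TRUE = lambda a: lambda b: a
FALSE = lambda a: lambda b: b

# Church numerals: the numeral n means "apply f to x exactly n times".
ZERO = lambda f: lambda x: x
SUCC = lambda n: lambda f: lambda x: f(n(f)(x))

def to_int(n):
    """Convert a Church numeral to a Python int by counting applications."""
    return n(lambda k: k + 1)(0)

TWO = SUCC(SUCC(ZERO))
```

Here `TRUE(1)(2)` evaluates to `1` and `to_int(TWO)` to `2`, showing how booleans and numbers reduce to function application alone.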

# Hidden Markov Models Training – The Forward-Backward Algorithm

This article discusses the problem of learning the HMM parameters given an observation sequence and the set of possible states. The concept of backward probability is defined and an iterative algorithm for learning is presented. Derivations and diagrams are sketched out.
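The backward pass at the heart of the algorithm can be sketched on a toy 2-state HMM (all transition and emission numbers below are made up). `beta[t][s]` is the probability of the observations after time `t`, given state `s` at time `t`, computed by recursing from the end of the sequence:

```python
# Toy 2-state HMM with two observation symbols (hypothetical numbers).
A = [[0.7, 0.3], [0.4, 0.6]]   # transition probabilities A[i][j] = P(j | i)
B = [[0.9, 0.1], [0.2, 0.8]]   # emission probabilities B[s][o] = P(o | s)
obs = [0, 1, 0]

T, S = len(obs), len(A)
beta = [[0.0] * S for _ in range(T)]
beta[T - 1] = [1.0, 1.0]       # base case: nothing left to emit
for t in range(T - 2, -1, -1):
    for i in range(S):
        # Sum over the next state: transition, emit next symbol, then continue.
        beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                         for j in range(S))
```

Combined with the forward probabilities, these betas give the expected counts used in each re-estimation iteration.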

# Hidden Markov Models Decoding – The Viterbi Algorithm

This article discusses the problem of decoding – finding the most probable sequence of states that produced a sequence of observations. A dynamic programming approach is presented. Derivations and diagrams are sketched out and time complexity is analyzed.
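The dynamic programming idea can be sketched on a toy 2-state HMM (all probabilities below are illustrative assumptions): keep, for each time step and state, the probability of the best path ending there, plus a backpointer to recover the path. The table is `O(T * S)` in size and each cell looks at `S` predecessors, giving `O(T * S^2)` time:

```python
# Toy 2-state HMM with two observation symbols (hypothetical numbers).
A = [[0.7, 0.3], [0.4, 0.6]]   # transitions
B = [[0.9, 0.1], [0.2, 0.8]]   # emissions for symbols 0/1
pi = [0.5, 0.5]                # initial state distribution
obs = [0, 0, 1]

S = len(A)
# v[t][s] = probability of the most probable path ending in state s at time t.
v = [[pi[s] * B[s][obs[0]] for s in range(S)]]
back = []
for t in range(1, len(obs)):
    row, ptr = [], []
    for j in range(S):
        best_i = max(range(S), key=lambda i: v[-1][i] * A[i][j])
        ptr.append(best_i)
        row.append(v[-1][best_i] * A[best_i][j] * B[j][obs[t]])
    v.append(row)
    back.append(ptr)

# Follow the backpointers from the best final state to recover the path.
state = max(range(S), key=lambda s: v[-1][s])
path = [state]
for ptr in reversed(back):
    state = ptr[state]
    path.append(state)
path.reverse()
```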

# Short Primer on Probability

A primer on probability explaining the concepts of random variables, joint probabilities, marginalization, conditional probabilities, Bayes' Rule, probabilistic inference, and conditional independence. Examples and formulae are included.
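Marginalization and Bayes' Rule combine nicely in the classic medical-test example (the numbers below are hypothetical): even a fairly accurate test for a rare condition yields a surprisingly low posterior.

```python
# Hypothetical test: 99% sensitivity, 5% false-positive rate, 1% prevalence.
p_disease = 0.01
p_pos_given_disease = 0.99
p_pos_given_healthy = 0.05

# Marginalize over the two health states to get P(positive)...
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# ...then apply Bayes' Rule: P(disease | positive).
p_disease_given_pos = p_pos_given_disease * p_disease / p_pos
```

With these numbers the posterior is only about 1/6, because healthy people vastly outnumber sick ones.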