{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "import math  # Just ignore this :-)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ML E2020 - Week 10 - Practical Exercises"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "22a39008-e542-438a-83f4-fd54f37017b8"
    }
   },
   "source": [
    "In the exercise below, you will see an example of how a hidden Markov model (HMM)\n",
    "can be represented, and you will implement and experiment with the computation of the joint probability and various decodings as explained in the lectures in week 9."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "13722c25-cdc1-4ea4-98f0-a779946ce53c"
    }
   },
   "source": [
    "# 1 - Representing an HMM\n",
    "\n",
    "We can represent a HMM as a triple consisting of three matrices: a $K \\times 1$ matrix with the initial state probabilities, a $K \\times K$ matrix with the transition probabilities and a $K \\times |\\Sigma|$ matrix with the emission probabilities. In Python we can write the matrices like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "b9f24259-216a-438c-86e7-aad8cb275bc6"
    }
   },
   "outputs": [],
   "source": [
    "init_probs_7_state = [0.00, 0.00, 0.00, 1.00, 0.00, 0.00, 0.00]\n",
    "\n",
    "trans_probs_7_state = [\n",
    "    [0.00, 0.00, 0.90, 0.10, 0.00, 0.00, 0.00],\n",
    "    [1.00, 0.00, 0.00, 0.00, 0.00, 0.00, 0.00],\n",
    "    [0.00, 1.00, 0.00, 0.00, 0.00, 0.00, 0.00],\n",
    "    [0.00, 0.00, 0.05, 0.90, 0.05, 0.00, 0.00],\n",
    "    [0.00, 0.00, 0.00, 0.00, 0.00, 1.00, 0.00],\n",
    "    [0.00, 0.00, 0.00, 0.00, 0.00, 0.00, 1.00],\n",
    "    [0.00, 0.00, 0.00, 0.10, 0.90, 0.00, 0.00],\n",
    "]\n",
    "\n",
    "emission_probs_7_state = [\n",
    "    #   A     C     G     T\n",
    "    [0.30, 0.25, 0.25, 0.20],\n",
    "    [0.20, 0.35, 0.15, 0.30],\n",
    "    [0.40, 0.15, 0.20, 0.25],\n",
    "    [0.25, 0.25, 0.25, 0.25],\n",
    "    [0.20, 0.40, 0.30, 0.10],\n",
    "    [0.30, 0.20, 0.30, 0.20],\n",
    "    [0.15, 0.30, 0.20, 0.35],\n",
    "]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "0fb33618-de36-44df-8462-14b7e58d3b4b"
    }
   },
   "source": [
    "How do we use these matrices? Remember that we are given some sequence of observations, e.g. like this:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "ed50140f-2197-43fc-837a-e3799ffe5904"
    }
   },
   "outputs": [],
   "source": [
    "obs_example = 'GTTTCCCAGTGTATATCGAGGGATACTACGTGCATAGTAACATCGGCCAA'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "2bef7e67-d545-447f-9d0c-7b6078ee627b"
    }
   },
   "source": [
    "To make a lookup in our three matrices, it is convenient to translate each symbol in the string to an index."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "ed1f0acd-83cf-451a-ac94-957ebd8a0d87"
    }
   },
   "outputs": [],
   "source": [
    "def translate_observations_to_indices(obs):\n",
    "    mapping = {'a': 0, 'c': 1, 'g': 2, 't': 3}\n",
    "    return [mapping[symbol.lower()] for symbol in obs]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "6fcef106-fbc3-4f3d-ab54-5d89271eedf0"
    }
   },
   "source": [
    "Let's try to translate the example above using this function:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "8604d821-64fb-49e5-8c98-a791c6d42831"
    }
   },
   "outputs": [],
   "source": [
    "obs_example_trans = translate_observations_to_indices(obs_example)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "nbpresent": {
     "id": "9a29e558-111a-47bf-a450-2f7a61613ad4"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[2,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 1,\n",
       " 1,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 3,\n",
       " 2,\n",
       " 3,\n",
       " 0,\n",
       " 3,\n",
       " 0,\n",
       " 3,\n",
       " 1,\n",
       " 2,\n",
       " 0,\n",
       " 2,\n",
       " 2,\n",
       " 2,\n",
       " 0,\n",
       " 3,\n",
       " 0,\n",
       " 1,\n",
       " 3,\n",
       " 0,\n",
       " 1,\n",
       " 2,\n",
       " 3,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 3,\n",
       " 0,\n",
       " 2,\n",
       " 3,\n",
       " 0,\n",
       " 0,\n",
       " 1,\n",
       " 0,\n",
       " 3,\n",
       " 1,\n",
       " 2,\n",
       " 2,\n",
       " 1,\n",
       " 1,\n",
       " 0,\n",
       " 0]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "obs_example_trans"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Use the function below to translate the indices back to observations:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def translate_indices_to_observations(indices):\n",
    "    mapping = ['a', 'c', 'g', 't']\n",
    "    return ''.join(mapping[idx] for idx in indices)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'gtttcccagtgtatatcgagggatactacgtgcatagtaacatcggccaa'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "translate_indices_to_observations(translate_observations_to_indices(obs_example))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now each symbol has been transformed (predictably) into a number which makes it much easier to make lookups in our matrices. We'll do the same thing for a list of states (a path):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def translate_path_to_indices(path):\n",
    "    return list(map(lambda x: int(x), path))\n",
    "\n",
    "def translate_indices_to_path(indices):\n",
    "    return ''.join([str(i) for i in indices])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Given a path through an HMM, we can now translate it to a list of indices:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[3,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 3,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1,\n",
       " 0,\n",
       " 2,\n",
       " 1]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "path_example = '33333333333321021021021021021021021021021021021021'\n",
    "\n",
    "translate_path_to_indices(path_example)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "cf3f4b33-715a-443a-a933-17f29d3ffa75"
    }
   },
   "source": [
    "Finally, we can collect the three matrices in a class to make it easier to work with our HMM."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "nbpresent": {
     "id": "43336c35-2a6f-46e9-b86e-0367377dca39"
    }
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[[0.0, 0.0, 0.9, 0.1, 0.0, 0.0, 0.0],\n",
       " [1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],\n",
       " [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],\n",
       " [0.0, 0.0, 0.05, 0.9, 0.05, 0.0, 0.0],\n",
       " [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],\n",
       " [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],\n",
       " [0.0, 0.0, 0.0, 0.1, 0.9, 0.0, 0.0]]"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "class hmm:\n",
    "    def __init__(self, init_probs, trans_probs, emission_probs):\n",
    "        self.init_probs = init_probs\n",
    "        self.trans_probs = trans_probs\n",
    "        self.emission_probs = emission_probs\n",
    "\n",
    "# Collect the matrices in a class.\n",
    "hmm_7_state = hmm(init_probs_7_state, trans_probs_7_state, emission_probs_7_state)\n",
    "\n",
    "# We can now reach the different matrices by their names. E.g.:\n",
    "hmm_7_state.trans_probs"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For testing, here's another model (which we will refer to as the 3-state model)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "init_probs_3_state = [0.10, 0.80, 0.10]\n",
    "\n",
    "trans_probs_3_state = [\n",
    "    [0.90, 0.10, 0.00],\n",
    "    [0.05, 0.90, 0.05],\n",
    "    [0.00, 0.10, 0.90],\n",
    "]\n",
    "\n",
    "emission_probs_3_state = [\n",
    "    #   A     C     G     T\n",
    "    [0.40, 0.15, 0.20, 0.25],\n",
    "    [0.25, 0.25, 0.25, 0.25],\n",
    "    [0.20, 0.40, 0.30, 0.10],\n",
    "]\n",
    "\n",
    "hmm_3_state = hmm(init_probs_3_state, trans_probs_3_state, emission_probs_3_state)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "69bf9e07-95ad-429f-b85b-dbb8c1b6c310"
    }
   },
   "source": [
    "# 2 - Validating an HMM (and handling floats)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "4c1bd08c-2c49-45e6-81e7-5bbbbd2f799c"
    }
   },
   "source": [
    "Before using the model we'll write a function to validate that the model is valid. That is, the matrices should have the right dimensions and the following things should be true:\n",
    "\n",
    "1. The initial probabilities must sum to 1.\n",
    "2. Each row in the matrix of transition probabilities must sum to 1.\n",
    "3. Each row in the matrix of emission probabilities must sum to 1.\n",
    "4. All numbers should be between 0 and 1, inclusive.\n",
    "\n",
    "Write a function `validate_hmm` that given a model returns True if the model is valid, and False otherwise:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "2c6c3d32-db85-482b-970c-9bcacd9f5b32"
    }
   },
   "outputs": [],
   "source": [
    "def validate_hmm(model):\n",
    "    pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "05430599-405e-4ffa-add6-84288abe4364"
    }
   },
   "source": [
    "We can now use this function to check whether the example model is a valid model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "1e3e74d8-7951-4362-af49-240b7d90532d"
    }
   },
   "outputs": [],
   "source": [
    "validate_hmm(hmm_7_state)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You might run into problems related to summing floating point numbers because summing floating point numbers does not (always) give the expected result as illustrated by the following examples. How do you suggest to deal with this?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9999999999999999"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "0.15 + 0.30 + 0.20 + 0.35"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The order of the terms matter."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "1.0"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "0.20 + 0.35 + 0.15 + 0.30"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Because it changes the prefix sums"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.44999999999999996"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "0.15 + 0.30"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.7000000000000001"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "0.20 + 0.35 + 0.15"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.44999999999999996"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "0.15 + 0.30"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On should never compare floating point numbers. They represent only an 'approximation'. Read more about the 'problems' in 'What Every Computer Scientist Should Know About Floating-Point Arithmetic' at:\n",
    "\n",
    "http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "7b9b636c-c901-4ea8-a99d-190a396b2405"
    }
   },
   "source": [
    "# 3 - Computing the Joint Probability"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Recall that the joint probability $p({\\bf X}, {\\bf Z}) = p({\\bf x}_1, \\ldots, {\\bf x}_N, {\\bf z}_1, \\ldots, {\\bf z}_N)$ of a hidden Markov model (HMM) can be compute as\n",
    "\n",
    "$$\n",
    "p({\\bf x}_1, \\ldots, {\\bf x}_N, {\\bf z}_1, \\ldots, {\\bf z}_N) = p({\\bf z}_1) \n",
    "\\left[ \\prod_{n=2}^N p({\\bf z}_n \\mid {\\bf z}_{n-1}) \\right]\n",
    "\\prod_{n=1}^N p({\\bf x}_n \\mid {\\bf z}_n)\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Implementing without log-transformation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Write a function `joint_prob` given a model (e.g. in the representation above) and sequence of observables, ${\\bf X}$, and a sequence of hidden states, ${\\bf Z}$, computes the joint probability cf. the above formula."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "3769ed05-effc-42d7-ae3b-655ea2082dc3"
    }
   },
   "outputs": [],
   "source": [
    "def joint_prob(model, x, z):\n",
    "    pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "bf292319-b283-4947-ba29-8db3efea4541"
    }
   },
   "source": [
    "Now compute the joint probability of the ${\\bf X}$ (`x_short`) and ${\\bf Z}$ (`z_short`) below using the 7-state (`hmm_7_state`) model introduced above. (*Remember to translate them first using the appropriate functions introduces above!*)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "51699e6f-c98d-4552-8d21-5e37153bab84"
    }
   },
   "outputs": [],
   "source": [
    "x_short = 'GTTTCCCAGTGTATATCGAGGGATACTACGTGCATAGTAACATCGGCCAA'\n",
    "z_short = '33333333333321021021021021021021021021021021021021'\n",
    "\n",
    "# Your code here ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "62cc8755-8cf4-4b52-9e47-73395c18e741"
    }
   },
   "source": [
    "## Implementing with log-transformation (i.e. in \"log-space\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "65813762-66da-4d50-8290-197eb23d5c90"
    }
   },
   "source": [
    "Now implement the joint probability function using log-transformation as explained in the lecture. We've given you a log-function that handles $\\log(0)$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "c4e0a0e0-1fec-462d-89ad-3797ba7d24f3"
    }
   },
   "outputs": [],
   "source": [
    "def log(x):\n",
    "    if x == 0:\n",
    "        return float('-inf')\n",
    "    return math.log(x)\n",
    "\n",
    "def joint_prob_log(model, x, z):\n",
    "    pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "d47a1d9b-de76-41cc-88de-694965dd1ce7"
    }
   },
   "source": [
    "Confirm that the log-transformed function is correct by comparing the output of `joint_prob_log` to the output of `joint_prob`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Your code here ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "06151990-65f6-401e-aa31-7adc39603c4b"
    }
   },
   "source": [
    "## Comparison of Implementations"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "10ae159d-95f8-47aa-a42b-6bda8e25a0a2"
    }
   },
   "source": [
    "Now that you have two ways to compute the joint probability given a model, a sequence of observations, and a sequence of hidden states, try to make an experiment to figure out how long a sequence can be before it becomes essential to use the log-transformed version. For this experiment we'll use two longer sequences."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x_long = 'TGAGTATCACTTAGGTCTATGTCTAGTCGTCTTTCGTAATGTTTGGTCTTGTCACCAGTTATCCTATGGCGCTCCGAGTCTGGTTCTCGAAATAAGCATCCCCGCCCAAGTCATGCACCCGTTTGTGTTCTTCGCCGACTTGAGCGACTTAATGAGGATGCCACTCGTCACCATCTTGAACATGCCACCAACGAGGTTGCCGCCGTCCATTATAACTACAACCTAGACAATTTTCGCTTTAGGTCCATTCACTAGGCCGAAATCCGCTGGAGTAAGCACAAAGCTCGTATAGGCAAAACCGACTCCATGAGTCTGCCTCCCGACCATTCCCATCAAAATACGCTATCAATACTAAAAAAATGACGGTTCAGCCTCACCCGGATGCTCGAGACAGCACACGGACATGATAGCGAACGTGACCAGTGTAGTGGCCCAGGGGAACCGCCGCGCCATTTTGTTCATGGCCCCGCTGCCGAATATTTCGATCCCAGCTAGAGTAATGACCTGTAGCTTAAACCCACTTTTGGCCCAAACTAGAGCAACAATCGGAATGGCTGAAGTGAATGCCGGCATGCCCTCAGCTCTAAGCGCCTCGATCGCAGTAATGACCGTCTTAACATTAGCTCTCAACGCTATGCAGTGGCTTTGGTGTCGCTTACTACCAGTTCCGAACGTCTCGGGGGTCTTGATGCAGCGCACCACGATGCCAAGCCACGCTGAATCGGGCAGCCAGCAGGATCGTTACAGTCGAGCCCACGGCAATGCGAGCCGTCACGTTGCCGAATATGCACTGCGGGACTACGGACGCAGGGCCGCCAACCATCTGGTTGACGATAGCCAAACACGGTCCAGAGGTGCCCCATCTCGGTTATTTGGATCGTAATTTTTGTGAAGAACACTGCAAACGCAAGTGGCTTTCCAGACTTTACGACTATGTGCCATCATTTAAGGCTACGACCCGGCTTTTAAGACCCCCACCACTAAATAGAGGTACATCTGA'\n",
    "z_long = '3333321021021021021021021021021021021021021021021021021021021021021021033333333334564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564563210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210321021021021021021021021033334564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564563333333456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456332102102102102102102102102102102102102102102102102102102102102102102102102102102102102102102102103210210210210210210210210210210210210210210210210210210210210210'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "nbpresent": {
     "id": "85880ba1-bc5b-455a-9335-000678ea9b0e"
    }
   },
   "source": [
    "Now compute the joint probability with `joint_prob` the 7-state (hmm_7_state) model introduced above, and see when it breaks (i.e. when it wrongfully becomes 0). Does this make sense? Here's some code to get you started."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": true,
    "nbpresent": {
     "id": "c558e928-6660-4b69-8db3-3341e8b19dba"
    }
   },
   "outputs": [],
   "source": [
    "for i in range(0, len(x_long), 100):\n",
    "    x = x_long[:i]\n",
    "    z = z_long[:i]\n",
    "    \n",
    "    x_trans = translate_observations_to_indices(x)\n",
    "    z_trans = translate_path_to_indices(z)\n",
    "    \n",
    "    # Make your experiment here..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the cell below you should state for which $i$ computing the joint probability (for the two models considered) using `joint_prob` wrongfully becomes 0."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Your answer here:**\n",
    "\n",
    "For the 7-state model, `joint_prob` becomes 0 for **i = ?**."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# 4 - Viterbi Decoding"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Below you will implement and experiment with the Viterbi algorithm. The implementation has been split into three parts:\n",
    "\n",
    "1. Fill out the $\\omega$ table using the recursion presented at the lecture.\n",
    "2. Find the state with the highest probability after observing the entire sequence of observations.\n",
    "3. Backtrack from the state found in the previous step to obtain the optimal path.\n",
    "\n",
    "We'll be working with the 7-state model (`hmm_7_state`) and the helper function for translating between observations, hidden states, and indicies, as introduced above."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Additionally, you're given the function below that constructs a table of a specific size filled with zeros."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def make_table(m, n):\n",
    "    \"\"\"Make a table with `m` rows and `n` columns filled with zeros.\"\"\"\n",
    "    return [[0] * n for _ in range(m)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You'll be testing your code with the same two sequences as above, i.e:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x_short = 'GTTTCCCAGTGTATATCGAGGGATACTACGTGCATAGTAACATCGGCCAA'\n",
    "z_short = '33333333333321021021021021021021021021021021021021'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "x_long = 'TGAGTATCACTTAGGTCTATGTCTAGTCGTCTTTCGTAATGTTTGGTCTTGTCACCAGTTATCCTATGGCGCTCCGAGTCTGGTTCTCGAAATAAGCATCCCCGCCCAAGTCATGCACCCGTTTGTGTTCTTCGCCGACTTGAGCGACTTAATGAGGATGCCACTCGTCACCATCTTGAACATGCCACCAACGAGGTTGCCGCCGTCCATTATAACTACAACCTAGACAATTTTCGCTTTAGGTCCATTCACTAGGCCGAAATCCGCTGGAGTAAGCACAAAGCTCGTATAGGCAAAACCGACTCCATGAGTCTGCCTCCCGACCATTCCCATCAAAATACGCTATCAATACTAAAAAAATGACGGTTCAGCCTCACCCGGATGCTCGAGACAGCACACGGACATGATAGCGAACGTGACCAGTGTAGTGGCCCAGGGGAACCGCCGCGCCATTTTGTTCATGGCCCCGCTGCCGAATATTTCGATCCCAGCTAGAGTAATGACCTGTAGCTTAAACCCACTTTTGGCCCAAACTAGAGCAACAATCGGAATGGCTGAAGTGAATGCCGGCATGCCCTCAGCTCTAAGCGCCTCGATCGCAGTAATGACCGTCTTAACATTAGCTCTCAACGCTATGCAGTGGCTTTGGTGTCGCTTACTACCAGTTCCGAACGTCTCGGGGGTCTTGATGCAGCGCACCACGATGCCAAGCCACGCTGAATCGGGCAGCCAGCAGGATCGTTACAGTCGAGCCCACGGCAATGCGAGCCGTCACGTTGCCGAATATGCACTGCGGGACTACGGACGCAGGGCCGCCAACCATCTGGTTGACGATAGCCAAACACGGTCCAGAGGTGCCCCATCTCGGTTATTTGGATCGTAATTTTTGTGAAGAACACTGCAAACGCAAGTGGCTTTCCAGACTTTACGACTATGTGCCATCATTTAAGGCTACGACCCGGCTTTTAAGACCCCCACCACTAAATAGAGGTACATCTGA'\n",
    "z_long = '3333321021021021021021021021021021021021021021021021021021021021021021033333333334564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564563210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210210321021021021021021021021033334564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564564563333333456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456456332102102102102102102102102102102102102102102102102102102102102102102102102102102102102102102102103210210210210210210210210210210210210210210210210210210210210210'"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Remember to translate these sequences to indices before using them with your algorithms."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Implementing without log-transformation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "First, we will implement the algorithm without log-transformation. This will cause issues with numerical stability (like above when computing the joint probability), so we will use the log-transformation trick to fix this in the next section."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Computation of the $\\omega$ table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def compute_w(model, x):\n",
    "    k = len(model.init_probs)\n",
    "    n = len(x)\n",
    "    \n",
    "    w = make_table(k, n)\n",
    "    \n",
    "    # Base case: fill out w[i][0] for i = 0..k-1\n",
    "    # ...\n",
    "    \n",
    "    # Inductive case: fill out w[i][j] for i = 0..k, j = 0..n-1\n",
    "    # ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Finding the joint probability of an optimal path"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now, write a function that given the $\\omega$-table, returns the probability of an optimal path through the HMM. As explained in the lecture, this corresponds to finding the highest probability in the last column of the table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def opt_path_prob(w):\n",
    "    pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now test your implementation in the box below:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "w = compute_w(hmm_7_state, x_short)\n",
    "opt_path_prob(w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now do the same for `x_long`. What happens?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Your code here ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Obtaining an optimal path through backtracking"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Implement backtracking to find a most probable path of hidden states given the $\\omega$-table."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def backtrack(model, x, w):\n",
    "    pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "w = compute_w(hmm_7_state, x_short)\n",
    "z_viterbi = backtrack(hmm_7_state, x_short, w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now do the same for `x_long`. What happens?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Your code here ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Implementing with log-transformation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now implement the Viterbi algorithm with log-transformation. The steps are the same as above."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Computation of the (log-transformed) $\\omega$ table"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def compute_w_log(model, x):\n",
    "    k = len(model.init_probs)\n",
    "    n = len(x)\n",
    "    \n",
    "    w = make_table(k, n)\n",
    "    \n",
    "    # Base case: fill out w[i][0] for i = 0..k-1\n",
    "    # ...\n",
    "    \n",
    "    # Inductive case: fill out w[i][j] for i = 0..k, j = 0..n-1\n",
    "    # ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Finding the (log-transformed) joint probability of an optimal path"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def opt_path_prob_log(w):\n",
    "    pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "w = compute_w_log(hmm_7_state, x_short)\n",
    "opt_path_prob_log(w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now do the same for `x_long`. What happens?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Your code here ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Obtaining an optimal path through backtracking"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def backtrack_log(model, x, w):\n",
    "    pass"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "w = compute_w_log(hmm_7_state, x_short)\n",
    "z_viterbi_log = backtrack_log(hmm_7_state, x_short, w)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now do the same for `x_long`. What happens?"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Your code here ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Does it work?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Think about how to verify that your implementations of Viterbi (i.e. `compute_w`, `opt_path_prob`, `backtrack`, and there log-transformed variants `compute_w_log`, `opt_path_prob_log`, `backtrack_log`) are correct.\n",
    "\n",
    "One thing that should hold is that the probability of a most likely path as computed by `opt_path_prob` (or `opt_path_prob_log`) for a given sequence of observables (e.g. `x_short` or `x_long`) should be equal to the joint probability of a corersponding most probable path as found by `backtrack` (or `backtrack_log`) and the given sequence of observables. Why?\n",
    "\n",
    "Make an experiment that validates that this is the case for your implementations of Viterbi and `x_short` and `x_long`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Check that opt_path_prob is equal to joint_prob(hmm_7_state, x_short, z_viterbi)\n",
    "\n",
    "# Your code here ...\n",
    "\n",
    "# Check that opt_path_prob_log is equal to joint_prob_log(hmm_7_state, x_short, z_viterbi_log)\n",
    "\n",
    "# Your code here ...\n",
    "\n",
    "# Do the above checks for x_long ...\n",
    "\n",
    "# Your code here ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Do your implementations pass the above checks?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Does log-transformation matter?"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Make an experiment that investigates how long the input string can be before `backtrack` and `backtrack_log` start to disagree on a most likely path and its probability."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Your code here ..."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Your answer here:**\n",
    "\n",
    "For the 7-state model, `backtrack` and `backtrack_log` start to disagree on a most likely path and its probability for **i = ?** ."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 5 - Posterior Decoding"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you have time, try to implement posterior decoding (with scaling, if possible) as explained in the lecture"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "def forward(model, x):\n",
    "    pass\n",
    "\n",
    "def backward(model, x):\n",
    "    pass\n",
    "\n",
    "def posterior_decoding(model, x):\n",
    "    pass"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Compare the Viterbi and posterior decodings for the `x_short` and `x_long` using the 7-state model (`hmm_7_state`)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "# Your code here ..."
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}