Markov Chain - A Practical Example


Intro to the problem

In this post I will show a practical example of markov chain.

Let’s try to map the movement of freelancer drivers in Dhaka. We can divide the area of Dhaka into three zones – North Dhaka, Middle Dhaka and South Dhaka.
So we can find the following probabilities for the movement of a driver:

  • Among the North Dhaka’s freelancer drivers, 30% will remain in North Dhaka, 30% will move to Middle Dhaka, while the remaining 40% will go to South Dhaka.
  • Of all the drivers in Middle Dhaka, 50% and 30% will move to North Dhaka and South Dhaka, respectively; 20% will remain in Middle Zone.
  • In the South Dhaka, drivers have a 40% chance of moving to North Dhaka, 40% chance of staying in the South Dhaka; 20% drivers will move to Middle Dhaka.

Once a driver is in a particular zone, he can either move to the next zone or stay back in the same zone. His movement will be decided only by his current state and not by the sequence of past states.

The state space in this example includes North Dhaka, Middle Dhaka and South Dhaka. It follows all the properties of Markov Chains because the current state has the power to predict the next stage.

Transition Probability Matrix

Let’s construct the transition probability matrix -

zone <- c("North","Middle","South")
zoneTPM <- matrix(c(0.3,0.3,0.4,
                    0.5,0.2,0.3,
                    0.4,0.2,0.4), 
                  nrow=3, byrow=T, dimnames = list(zone,zone))

Hence, the TPM is -

zoneTPM
#        North Middle South
# North    0.3    0.3   0.4
# Middle   0.5    0.2   0.3
# South    0.4    0.2   0.4

The package named markovchain can help us in implementing Markov Chains in R.

Install and load the package -

# install.packages("markovchain")
library(markovchain)

Now using the function new() create a Markov Chain object -

driverzone <- new("markovchain",
                  states = zone,
                  transitionMatrix = zoneTPM,
                  name = "Driver Movement")
driverzone
# Driver Movement 
#  A  3 - dimensional discrete Markov Chain defined by the following states: 
#  North, Middle, South 
#  The transition matrix  (by rows)  is defined as follows: 
#        North Middle South
# North    0.3    0.3   0.4
# Middle   0.5    0.2   0.3
# South    0.4    0.2   0.4

Transition Diagram

To plot the above transition matrix we can use R package, diagram. The package has a function called plotmat() that can help us in plotting a state space diagram of the transition matrix in an easy-to-understand manner.

Install and load the package -

# install.packages("diagram")
library(diagram)

Code for plotting the transition probability diagram -

plotmat(A = zoneTPM,
        box.type = "circle",          # shape of box
        box.lwd = 1,                  # border density of box
        relsize = 0.9,                # scaling factor for size of graph
        cex.txt = 0.7,                # size of probabilities
        lwd = 1,                      # border density of state to state arrows
        lcol = "black",
        box.col = "cornsilk1",
        box.size = 0.1,               # size of box
        box.prop = 0.6,               # height to width ratio of box
        arr.length = 0.5, arr.width = 0.2,
        self.cex = 0.5,                          # size of self probability box
        self.shifty = -0.03, self.shiftx = 0.13, # location of self prob. box
        name = c("North Dhaka","Middle Dhaka","South Dhaka"),  # Optional
        main = "Transition Diagram",
        cex.main = 1.3                # relative size of main title
        )

The above Markov Chain can be used to answer some of the future state questions.

Questions

Assuming that a driver is currently in the North Zone, what is the probability that the driver will again be in the North Zone after two trips?

Answer:

Manually:

A driver can reach the North Zone again in his second trip in three different ways:

  1. Staying in the same zone i.e. North Dhaka to North Dhaka Probability: P(N-N) = 0.3*0.3 = 0.09
  2. Middle Dhaka to North Dhaka Probability: P(M-N) = 0.4*0.4 = 0.12
  3. South Dhaka to North Dhaka Probability: P(S-N) = 0.5*0.4 = 0.2

Therefore, probability of second trip to North = 0.09 + 0.12 + 0.20 = 0.41 (0.40 approximately)

Using code:

Using the markov chain object created earlier using the function new() -

driverzone^2
# Driver Movement^2 
#  A  3 - dimensional discrete Markov Chain defined by the following states: 
#  North, Middle, South 
#  The transition matrix  (by rows)  is defined as follows: 
#        North Middle South
# North   0.40   0.23  0.37
# Middle  0.37   0.25  0.38
# South   0.38   0.24  0.38

This gives us the direct probability of a driver coming back to the North Zone after two trips.

The calculation can be done for subsequent trips as well. For example, the probability of coming back to North Zone in 4th trip :

driverzone^3
# Driver Movement^3 
#  A  3 - dimensional discrete Markov Chain defined by the following states: 
#  North, Middle, South 
#  The transition matrix  (by rows)  is defined as follows: 
#        North Middle South
# North  0.383  0.240 0.377
# Middle 0.388  0.237 0.375
# South  0.386  0.238 0.376

However if we increase n (No. of trips) the predictive power tends to go down, where the Markov Chain reaches an equilibrium called stationary state. In the above case, after 9 trips the state becomes stationary -

driverzone^9
# Driver Movement^9 
#  A  3 - dimensional discrete Markov Chain defined by the following states: 
#  North, Middle, South 
#  The transition matrix  (by rows)  is defined as follows: 
#            North    Middle     South
# North  0.3853211 0.2385321 0.3761468
# Middle 0.3853211 0.2385321 0.3761468
# South  0.3853211 0.2385321 0.3761468

Steady State

The stationary state can be calculated using some linear algebra methods; however, the driverzone package has a function called steadyStates() that can do the work -

steadyStates(driverzone)
#          North    Middle     South
# [1,] 0.3853211 0.2385321 0.3761468

So we can say that, in the stationary state, a driver has a probability of 0.385 of ending up in the North Dhaka.

If there were 100 drivers in all and each completes 9 trips in a day, how many of them will end up in the Middle Dhaka?

steadyStates(driverzone)*100
#         North   Middle    South
# [1,] 38.53211 23.85321 37.61468

So, around 38 drivers will end up in North Dhaka, 24 in the Middle Dhaka and 37 in the South Dhaka.

Reference

  1. AN INTRODUCTION TO MARKOV CHAINS USING R by CHAITANYA SAGAR

Md Ahsanul Islam
Md Ahsanul Islam
Freelance Data Analysis and R Programmer

Statistics graduate student currently researching on econometrics