All about Chained Matrix Products

Suppose that you wake up one morning and discover the following note under your pillow:

Good morning:

I left on your desk a stack of matrices. Your mission is to compute their product. You have to do it in a way that will minimize the total number of scalar products.

As usual, destroy this message after you read it. We shall come this evening to collect the result.


Recall that matrix product is an associative operation thus the result is independent of the way you partition the sequence of matrices when you compute the product. For example, if we have to compute the product P=ABCD where A, B, C and D are some given matrices with appropriate dimensions, then

A(B(CD)) = A((BC)D) = (AB)(CD) =((AB)C)D

Also observe that if A is an m by k matrix and B is a k by n matrix, then the prodcut AB involves m*k*n scalar products.

It is a fact of life that the total number of scalar products executed in the course of a chained matrix multiplication can vary significantly depending on how we parenthesize the sequence. For example., consider the following case:

Matrix Size









Here is a summary of the computational effort required by the five alternative parenthesization schemes:


Total number of scalar products











It is thus clear that much can be gained by optimizing the parenthesization scheme.

Suppose then that we have n matrices to multiply, call them M(1),...,M(n) and let (r(j),c(j)) be the size of M(j) , j=1,2,...,n.

The dynamic programming solution can be derived as follows: let

f(i,j) :=

Minimum number of scalar products required for multiplying the sub-chain M(i),...,M(j), (n >= j > i >= 1)

If we partition this sub-chain into two sub-sub-chains, say M(i),...,M(k) and M(k+1),...,M(j) then the minimum total number of scalar products will be

f(i,j;k) :=

f(i,k) + f(k+1,j) + r(i)*c(k)*c(j)

Thus, the optimal value of k is that which minimize this expression over all feasible values of k. Hence,

f(i,j) =

min {f(i,k) + f(k+1,j) + r(i)*c(k)*c(j) : k=i,...,j-1} , j > i

observing that f(j,j) = 0 for all j.

To solve this functional equation, we generate all pairs (i,j), such that j=i+s, where s varies from 1 to n-1. For example, for n=5 the functional equation is solved in the following order for the (i,j) pairs:

(1,2),(2,3),(3,4),(4,5) , (s=1)
(1,3),(2,4),(3,5), (s=2)
(1,4),(2,5) , (s=3)
(1,5) , (s=4)

This way, all the terms on the right-hand side of the functional equation that are required for computing the value of f(i,j) have already been computed.

the module display two results:

The optimal partition tree is displayed as a color matrix. Each row displays the optimal partition of the row above it. Here is an example:

Optimal Partition Tree
 [1]  [2] [ 3]   [4]  [5]  [6]  [7]  [8]   [9] [10][11][12][13][14][15]

Thus the optimal partition can be described in words as follows: The initial chain consists of 15 matrices (first row). This chain is partitioned into two parts: the first consists of the first matrix (red color in the second row), the second of the remaining 14 matrices (blue color in the second row). The second part of this chain (blue color in second row) is partitioned into two sub-chains: the first consists of the second matrix (brown color in third row) and the second of the remaining matrices (green color in the third row), and so on.

© The University of Melbourne 1994-2000.
Disclaimer and Copyright Information.
Conditions of use.

Date created: January 15, 2000
Date last modified: February 15, 2000
Authorised by: Moshe Sniedovich
Maintained by: Moshe Sniedovich, Department of Mathematics and Statistics.