numeric-linalg
Educational material on the SciPy implementation of numerical linear algebra algorithms
Name | Size | Mode
linear-solvers.ipynb | 7388B | -rw-r--r--
{ "cells": [ { "cell_type": "markdown", "id": "e1791697-e733-4721-9e10-fcdcfbda9064", "metadata": {}, "source": [ "# Solving linear systems in SciPy\n", "\n", "A linear system is a system of equations of the form\n", "$$ \\left\\{ \\begin{aligned} a_{1 1} x_1 + a_{1 2} x_2 + \\cdots + a_{1 n} x_n &= b_1 \\\\ a_{2 1} x_1 + a_{2 2} x_2 + \\cdots + a_{2 n} x_n &= b_2 \\\\ & \\vdots \\\\ a_{n 1} x_1 + a_{n 2} x_2 + \\cdots + a_{n n} x_n &= b_n\\end{aligned} \\right.$$\n", "in the variables $x_1, \\ldots, x_n$.\n", "\n", "Solving this system is equivalent to solving the equation $A x = b$ for $x$, where $A = (a_{ij})_{ij}$ is an $n \\times n$ matrix and $x = (x_1, \\ldots, x_n)$ and $b = (b_1, \\ldots, b_n)$ are vectors. A unique solution exists for every $b$ provided $A$ is invertible." ] }, { "cell_type": "markdown", "id": "26bc6a30-7853-4c78-adfb-320f0a65dd10", "metadata": {}, "source": [ "In SciPy, we can solve linear systems using the `scipy.linalg.solve` function, imported below as `la.solve`." ] }, { "cell_type": "code", "execution_count": 1, "id": "b1ced47f-6783-48ed-a4e4-4f1f4e2b8835", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[-0.12672176],\n", "       [ 0.1046832 ],\n", "       [ 1.19008264]])" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import numpy as np\n", "import scipy.linalg as la\n", "\n", "# Coefficient matrix A and right-hand side b of the system A x = b\n", "A = np.array([[6, 15, 1], [8, 7, 12], [2, 7, 8]])\n", "b = np.array([[2], [14], [10]])\n", "la.solve(A, b)" ] }, { "cell_type": "markdown", "id": "bfdacdbf-150f-4b4d-8072-ee18758d3b60", "metadata": {}, "source": [ "## But how does `la.solve` work?\n", "\n", "Internally, the `la.solve` function uses the [LAPACK library](https://netlib.org/lapack), a Fortran package for numerical linear algebra. LAPACK's generic linear solver works roughly as follows:\n", "\n", "1. Decompose $A$ as $A = P L U$, where $P$ is a permutation matrix, $L$ is a lower triangular matrix with $1$s on the diagonal and $U$ is an upper triangular matrix.\n", "2. 
Solve $P b' = b$ for $b'$, i.e. compute $b' = P^{-1} b$. Since $P$ is a permutation matrix, this operation is $O(n)$.\n", "3. Solve $L b'' = b'$ for $b''$, i.e. compute $b'' = L^{-1} P^{-1} b$. Since $L$ is known to be lower triangular, this operation is $O(n^2)$.\n", "4. Solve $U x = b''$ for $x$, i.e. compute $x = U^{-1} L^{-1} P^{-1} b = A^{-1} b$. Since $U$ is known to be upper triangular, this operation is $O(n^2)$.\n", "\n", "Steps 2 to 4 are implemented in the `GETRS` family of subroutines." ] }, { "cell_type": "markdown", "id": "5b7ad45f-ea81-4d46-9097-735cc159cf1a", "metadata": {}, "source": [ "As for the decomposition of $A$ in the first step, LAPACK uses a method called [_partial pivoting_](https://en.wikipedia.org/wiki/LU_decomposition#LU_factorization_with_partial_pivoting). A simple recursive algorithm using this method might look something like the following:\n", "\n", "1. If $A = a_{11}$ is $1 \\times 1$ then take $P = L = 1$ and $U = a_{11}$.\n", "2. If $A$ is $n \\times n$ for $n > 1$, choose $i_0$ that maximizes $|a_{i_0, 1}|$ and consider the $n \\times n$ permutation matrix $S_{i_0}$ that swaps the first and $i_0$-th basis vectors. Searching for $i_0$ is an $O(n)$ operation.\n", "3. Write\n", "   $$S_{i_0} A = \\left( \\begin{array}{c|c} a_{i_0} & A_{12}' \\\\ \\hline A_{21}' & A_{22}' \\end{array} \\right), $$\n", "   where $A_{22}'$ is $(n - 1) \\times (n - 1)$ and $a_{i_0} = a_{i_0, 1} \\ne 0$, since $A$ is invertible. Since $S_{i_0}$ acts on $A$ by swapping the first and $i_0$-th rows, computing $S_{i_0} A$ is an $O(n)$ operation.\n", "4. 
We want to solve the equation\n", "   $$S_{i_0} A = \\left( \\begin{array}{c|c} 1 & 0 \\\\ \\hline 0 & P_{22} \\end{array} \\right) \\left( \\begin{array}{c|c} 1 & 0 \\\\ \\hline L_{21} & L_{22} \\end{array} \\right) \\cdot \\left( \\begin{array}{c|c} u_{11} & U_{12} \\\\ \\hline 0 & U_{22} \\end{array} \\right),$$\n", "   where $P_{22}$ is a permutation matrix, $L_{22}$ is lower triangular with $1$s on the diagonal and $U_{22}$ is upper triangular. In other words, we want to solve the equations\n", "   $$\n", "   \\begin{aligned}\n", "   a_{i_0} &= u_{11} & A_{12}' &= U_{12} \\\\\n", "   A_{21}' &= u_{11} P_{22} L_{21} & A_{22}' &= P_{22} L_{21} U_{12} + P_{22} L_{22} U_{22}.\n", "   \\end{aligned}\n", "   $$\n", "   We must take $u_{11} = a_{i_0}$, $U_{12} = A_{12}'$ and $L_{21} = a_{i_0}^{-1} P_{22}^{-1} A_{21}'$, so it remains to solve the bottom-right equation.\n", "5. Substituting these values gives $P_{22} L_{21} U_{12} = a_{i_0}^{-1} A_{21}' A_{12}'$, so the bottom-right equation becomes $A_{22}' - a_{i_0}^{-1} A_{21}' A_{12}' = P_{22} L_{22} U_{22}$. Recursively factor the left-hand side to obtain $P_{22}$, $L_{22}$ and $U_{22}$, where $P_{22}$ is a permutation matrix, $L_{22}$ is lower triangular with $1$s on the diagonal and $U_{22}$ is upper triangular. Since $A_{21}'$ is a column vector and $A_{12}'$ is a row vector, computing the outer product $A_{21}' A_{12}'$ (and thus $A_{22}' - a_{i_0}^{-1} A_{21}' A_{12}'$) is an $O(n^2)$ operation. Since $P_{22}$ is a permutation matrix, computing $L_{21} = a_{i_0}^{-1} P_{22}^{-1} A_{21}'$ is an $O(n)$ operation.\n", "6. Take\n", "   $$\n", "   \\begin{aligned}\n", "   L &= \\begin{pmatrix} 1 & 0 \\\\ L_{21} & L_{22} \\end{pmatrix} &\n", "   U &= \\begin{pmatrix} u_{11} & U_{12} \\\\ 0 & U_{22} \\end{pmatrix}\n", "   \\end{aligned}\n", "   $$\n", "   for $L_{21}, L_{22}, u_{11}, U_{12}, U_{22}$ as above, so that\n", "   $$\n", "   S_{i_0} A = \\begin{pmatrix} 1 & 0 \\\\ 0 & P_{22} \\end{pmatrix} L U.\n", "   $$\n", "7. Hence, since $S_{i_0}^{-1} = S_{i_0}$, by taking\n", "   $$\n", "   P = S_{i_0} \\begin{pmatrix} 1 & 0 \\\\ 0 & P_{22} \\end{pmatrix}\n", "   $$\n", "   we get $A = P L U$ as desired!\n", "\n", "In total, this algorithm takes $n$ recursive steps to solve $A = P L U$. 
Since each step involves $O(n^2)$ operations, the total complexity of our factorization algorithm is $O(n^3)$." ] }, { "cell_type": "markdown", "id": "7c6678f2-9ae6-4daa-922c-828771c1a796", "metadata": {}, "source": [ "The actual LAPACK algorithm for factorizing $A$ is a slight variation of this concept, where we instead take the decomposition\n", "$$\n", "A =\n", "\\left(\n", "\\begin{array}{c|c}\n", " A_{11} & A_{12} \\\\ \\hline\n", " A_{21} & A_{22}\n", "\\end{array}\n", "\\right)\n", "$$\n", "where $A$ is an $m \\times n$ matrix and $A_{11}$ is a $\\left\\lfloor \\frac{\\min \\{m, n\\}}{2} \\right\\rfloor \\times \\left\\lfloor \\frac{\\min \\{m, n\\}}{2} \\right\\rfloor$ block. This is implemented in the `GETRF` family of subroutines." ] }, { "cell_type": "code", "execution_count": null, "id": "229ed6e3-493e-49c3-835a-0b8582aa6586", "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.7" } }, "nbformat": 4, "nbformat_minor": 5 }
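The recursive partial-pivoting factorization and the three solve steps described in the notebook can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how LAPACK or `la.solve` is actually implemented: the names `plu_recursive` and `solve_plu` are hypothetical, $A$ is assumed to be square and invertible, and permutations are stored as dense matrices for readability, so the sketch does more work than the operation counts given in the notebook.

```python
import numpy as np
import scipy.linalg as la


def plu_recursive(A):
    """Factor a square invertible matrix as A = P @ L @ U (illustrative sketch)."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    if n == 1:
        # Base case: P = L = 1 and U = a_11.
        return np.eye(1), np.eye(1), A.copy()

    # Choose the row i0 maximizing |a_{i0,1}| and build the swap matrix S.
    i0 = int(np.argmax(np.abs(A[:, 0])))
    S = np.eye(n)
    S[[0, i0]] = S[[i0, 0]]

    # S @ A is A with rows 0 and i0 swapped; do the swap directly.
    SA = A.copy()
    SA[[0, i0]] = SA[[i0, 0]]
    a = SA[0, 0]  # pivot, nonzero when A is invertible
    A12, A21, A22 = SA[:1, 1:], SA[1:, :1], SA[1:, 1:]

    # Recursively factor A22' - a^{-1} A21' A12' = P22 L22 U22.
    P22, L22, U22 = plu_recursive(A22 - (A21 @ A12) / a)

    # L21 = a^{-1} P22^{-1} A21', using P22^{-1} = P22^T for permutations.
    L21 = (P22.T @ A21) / a

    one, row0, col0 = np.ones((1, 1)), np.zeros((1, n - 1)), np.zeros((n - 1, 1))
    L = np.block([[one, row0], [L21, L22]])
    U = np.block([[np.array([[a]]), A12], [col0, U22]])
    # P = S^{-1} diag(1, P22), and S is its own inverse.
    P = S @ np.block([[one, row0], [col0, P22]])
    return P, L, U


def solve_plu(A, b):
    """Solve A x = b through the three steps P b' = b, L b'' = b', U x = b''."""
    P, L, U = plu_recursive(A)
    b1 = P.T @ b  # b' = P^{-1} b
    b2 = la.solve_triangular(L, b1, lower=True, unit_diagonal=True)
    return la.solve_triangular(U, b2, lower=False)


# Same system as in the notebook's first code cell.
A = np.array([[6, 15, 1], [8, 7, 12], [2, 7, 8]])
b = np.array([[2], [14], [10]])
P, L, U = plu_recursive(A)
x = solve_plu(A, b)
```

Checking `np.allclose(P @ L @ U, A)` and comparing `x` against `la.solve(A, b)` is a quick way to confirm the sketch agrees with SciPy on this example.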