List of Foundational Mathematical Concepts for AI Researchers

Motivation: There are three main reasons why AI/ML researchers, or anyone looking for a deeper understanding of AI, might want to learn mathematics. First, you may have used various deep learning tools and built useful projects, and now want deeper insight into concepts such as the chain rule, partial derivatives, and KL-divergence. Second, you may want to do theoretical research in AI/ML. Third, you may want to rely on mathematical arguments as motivation for creating new AI solutions (some people rely instead on intuitions from psychology, cognitive science, and other disciplines).

Below is a list of some important topics in foundational mathematics. As one can expect, this list is incomplete, and I am continuously working to expand it.

(*) marks a topic that is slightly more advanced and can be omitted.

  • Analysis
  • Set Theory: Union, Intersection, Complement, Limit Supremum and Limit Infimum; Axiomatic Set Theory (*): ZFC, Axiom of Choice
  • Topological Spaces: Topology, Limit points, Closure, (Topologically) Continuous functions, Compactness
  • Metric Spaces: Euclidean distance, Minkowski distance, Topology induced by Metric Space, Borel Set
  • Convergence of a Sequence and Series, ε-δ continuity, Uniform Continuity, Derivative, Complete Set, Totally Bounded Set
  • Normed Space: Metric induced by norm, Banach Spaces, p-norm, inequalities of p-norm
  • Sequential Compactness, Bolzano-Weierstrass Theorem, Heine-Borel Theorem, Heine-Cantor Theorem, Arzelà-Ascoli Theorem (*)
  • Inner Product Space: Hilbert Space, Orthogonality, Cauchy-Schwarz inequality
  • Functional Analysis Basics: Open Mapping Theorem, Banach Theorem, Lp Spaces, Hölder's inequality, Minkowski's inequality
  • Reproducing Kernel Hilbert Space
  • Representation Theory: Taylor Series, Fourier Series, Stone–Weierstrass theorem (*)
  • Measure Theory and Probability Theory
  • Algebra of Subsets, σ-algebra, Borel σ-algebra
  • Measurable Space, Measurable Sets, Measurable Function
  • Outer Measure, Measure, Carathéodory's criterion, Measure Space, Measurable Sets, Dynkin's π-λ theorem
  • Lebesgue Measure, Translation Invariance, Non-Measurable Sets (*): Vitali Sets
  • Lebesgue Integral, Radon-Nikodym Derivative
  • Probability Space: Unitarity, Kolmogorov Axioms, Conditional Probabilities, Bayes' Theorem, Independence
  • Random Variables, Expectation, Variance and Moments, Definition of Conditional Expectation, Chain Rule
  • Cumulative Distribution Function: Properties and Uniqueness of Distribution, Probability Distribution Function
  • Distribution: Bernoulli, Multinomial, Categorical, Binomial, Gaussian, Multivariate Gaussian, Dirichlet, Gamma, Poisson, Laplace, and Geometric.
  • Statistics
  • Statistics (e.g., mean, median), sufficient statistics
  • Central Limit Theorem
  • Monte-Carlo Sampling
  • Law of Large Numbers
  • Concentration Inequalities: Markov's inequality, Chernoff's inequality, Hoeffding's inequality, Azuma's inequality, Bernstein's inequality, Bennett's inequality
  • Martingale Theory, Filtration, Mixing Time
  • Stochastic Processes: e.g., Gaussian process, Lévy process, Brownian motion
  • Linear Algebra and Matrix Theory
  • Matrix: Algebra, Transpose, Trace, Square Matrices, Left and Right Inverse
  • Determinant, Determinant of Transpose, Invertible Matrices and Determinant Condition
  • Linear system of equations: Gaussian Elimination, Elementary Matrices, Adjoint Matrix, and Cramer's rule
  • Vector Space: Definition, Linear Span, Linear Independence, Basis, Existence of Basis, Basis Theorem
  • Row and Column Space, Rank of a Matrix, Rank-Nullity Theorem, Fundamental Theorems of Linear Algebra
  • Orthogonal Matrices, Rank of Orthogonal Matrix
  • Eigenvalues and Eigenvectors, Spectral Radius, Characteristic Polynomial, Trace = sum of eigenvalues, Determinant = product of eigenvalues, Rayleigh quotient, Algebraic Multiplicity, Geometric Multiplicity
  • Eigenvalue algorithm: Power iteration
  • Similar Matrices: Same characteristic polynomial
  • Diagonalizable matrix, Diagonalizable iff Algebraic Multiplicity = Geometric Multiplicity for every eigenvalue
  • Spectral decomposition for Positive Semi-Definite Matrix
  • Matrix Norm: Sub-multiplicativity, Elementwise norms e.g., Frobenius norm, Operator Norm
  • Singular values: Non-negativity, L2 Operator norm = Largest Singular Value, Rank of matrix = number of non-zero singular values, Schatten Norm (*), Ky-Fan Norm (*)
  • Matrix Decomposition: SVD, QR, LU, Cholesky
  • Generalized Eigenvalues (*), Jordan Normal Form (*), Relation to Algebraic and Geometric Multiplicity
  • Majorization and eigenvalue inequalities: Schur's theorem, Ky-Fan theorem, Weyl's inequality
  • Positive Eigenvalues: Perron–Frobenius theorem
  • Optimization
  • Convex Sets: Closure, Interior, Relative Interior, Affine Spaces, Convex Hull, Polyhedron, Separating hyperplane theorem, Supporting hyperplane theorem
  • Convex Function: First and second order conditions, subgradient, Jensen's inequality
  • Convex Conjugate, Biconjugate, Fenchel-Moreau Theorem
  • Gradient Descent and Convergence for Convex Functions
  • Stochastic Gradient Descent
  • Nesterov Momentum
  • Natural Gradient
  • Newton-Raphson Method
  • Second Order Methods
  • Zeroth Order Optimization
  • Duality: Lagrangian Duality, Weak Duality, Strong Duality, von Neumann's Minimax theorem, Fenchel's Duality
  • Non-linear optimization with Gradient domination (*)
  • Vector Calculus
  • Gradient and Hessian, Chain Rule
  • Partial Derivatives and Jacobian Matrices
  • Taylor Series in multiple variables
  • Matrix Calculus: Derivative of product, sum, trace, adjoint, inverse. Jacobi's derivative formula
  • Curl and Divergence (*), Stokes' theorem and Divergence Theorem (*)
  • Differential Equations
  • Linear Differential Equations and examples
  • Solving First Order Linear Differential Equations with Integrating Factor (*)
  • Partial Differential Equations: Laplace's Equation and Poisson's Equation (*)
  • Solving Laplace's Equation using Separation of Variables (*)
  • Special Equations: Legendre's differential equation, Bessel's differential equation, etc. (*)
  • Fourier Series
  • Fourier Transform
  • Complex Analysis
  • Complex Numbers and Complex Algebra (*)
  • Complex Functions (*)
  • Cauchy-Riemann Equations (*)
  • Cauchy Theorem and Cauchy Residue Theorem (*)
  • Laurent Series (*)
  • Liouville's theorem (*)
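
To make one of the linear algebra entries above concrete (power iteration together with the Rayleigh quotient), here is a minimal sketch in plain Python. The function name `power_iteration`, the fixed iteration count, and the 2x2 example matrix are illustrative assumptions rather than part of the list. The sketch also assumes the start vector is not orthogonal to the dominant eigenvector and that the matrix has a single eigenvalue of largest magnitude.

```python
import math


def power_iteration(matrix, num_iters=100):
    """Estimate the dominant eigenvalue and eigenvector of a square matrix.

    Hypothetical helper for illustration only: no convergence check is done,
    and a unique eigenvalue of largest magnitude is assumed.
    """
    n = len(matrix)
    # Arbitrary start vector (assumed not orthogonal to the dominant eigenvector).
    v = [1.0] + [0.0] * (n - 1)
    for _ in range(num_iters):
        # Multiply by the matrix, then renormalize to unit Euclidean length.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v (v already has unit norm).
    Av = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * Av[i] for i in range(n))
    return eigenvalue, v


# Symmetric example whose eigenvalues are 3 and 1.
A = [[2.0, 1.0], [1.0, 2.0]]
eigenvalue, eigenvector = power_iteration(A)
print(eigenvalue, eigenvector)
```

For this matrix the estimate should approach the larger eigenvalue, 3, and the two eigenvalues 3 and 1 are consistent with the identities listed above: their sum equals the trace (4) and their product equals the determinant (3).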