# Essential Math for Data Science

This article was written by Tirthajyoti Sarkar. Below is a summary. The full article also features courses that you could attend to learn the topics listed below, as well as numerous comments. We also added a few topics that we think are important and missing in the original article.

Statistics

• Data summaries and descriptive statistics, central tendency, variance, covariance, correlation,

• Basic probability: basic idea, expectation, probability calculus, Bayes’ theorem, conditional probability,

• Probability distribution functions: uniform, normal, binomial, chi-square, Student’s t-distribution; central limit theorem,

• Sampling, measurement, error, random number generation,

• Hypothesis testing, A/B testing, confidence intervals, p-values,

• ANOVA, t-test

• Linear and logistic regression, regularization

• Decision trees

• Robust and non-parametric statistics
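
As a small illustration of the descriptive-statistics and Bayes' theorem items above, here is a minimal sketch using only Python's standard library. The helper names (`covariance`, `correlation`, `bayes_posterior`) and all numbers are made up for this example and are not from the original article.

```python
import statistics


def covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)


def correlation(xs, ys):
    """Pearson correlation: covariance scaled by both sample standard deviations."""
    return covariance(xs, ys) / (statistics.stdev(xs) * statistics.stdev(ys))


def bayes_posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    # Total probability of a positive test (the "evidence" term).
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence


# Hypothetical data: y is exactly 2 * x, so the correlation should be about 1.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

print(statistics.mean(x))      # central tendency
print(statistics.variance(x))  # sample variance
print(covariance(x, y))
print(correlation(x, y))

# Hypothetical test: 1% prevalence, 95% sensitivity, 5% false positive rate.
print(bayes_posterior(prior=0.01, sensitivity=0.95, false_positive_rate=0.05))
```

With these assumed rates, the posterior probability comes out at roughly 16%, which is the usual reminder that a positive result on a rare condition is less conclusive than the test's accuracy suggests.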

Linear Algebra

• Basic properties of matrices and vectors: scalar multiplication, linear transformation, transpose, conjugate, rank, determinant,

• Inner and outer products, matrix multiplication rule and various algorithms, matrix inverse,

• Special matrices: square matrix, identity matrix, triangular matrix, sparse and dense matrices, unit vectors, symmetric matrix, Hermitian, skew-Hermitian and unitary matrices,

• Matrix factorization concept/LU decomposition, Gaussian/Gauss-Jordan elimination, solving the linear system of equations Ax = b,

• Vector space, basis, span, orthogonality, orthonormality, linear least squares,

• Eigenvalues, eigenvectors, and diagonalization, singular value decomposition (SVD)
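
To make the "Gaussian elimination" and "solving Ax = b" items concrete, the following is a minimal pure-Python sketch, assuming a small dense square system. The function name `solve_linear_system` and the example system are illustrative only; in practice a numerical library would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting.

    `a` is a list of n rows (each a list of n numbers); `b` is a list of n numbers.
    Raises ValueError if the matrix is singular or nearly singular.
    """
    n = len(a)
    # Work on an augmented copy [A | b] so the inputs are not modified.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]

    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]

        # Eliminate the entries below the pivot.
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


# Illustrative system: 2x + y = 5 and x + 3y = 10, whose solution is x = 1, y = 3.
solution = solve_linear_system([[2.0, 1.0], [1.0, 3.0]], [5.0, 10.0])
print(solution)
```

Partial pivoting is included because dividing by a very small pivot amplifies rounding error; the singularity threshold of `1e-12` is an arbitrary choice for this sketch.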

Calculus

• Functions of single variable, limit, continuity and differentiability,

• Mean value theorems, indeterminate forms and L’Hôpital’s rule,

• Maxima and minima,

• Product and chain rule,

• Taylor series, infinite series summation/integration concepts,

• Fundamental and mean value theorems of integral calculus, evaluation of definite and improper integrals,

• Beta and Gamma functions,

• Functions of multiple variables, limit, continuity, partial derivatives,

• Basics of ordinary and partial differential equations (not too advanced)
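
Two of the calculus items above, Taylor series and differentiability, can be explored numerically. The sketch below is only an illustration under simple assumptions: `exp_taylor` and `central_difference` are hypothetical helper names, and the step size and number of terms are arbitrary defaults.

```python
import math


def exp_taylor(x, terms=20):
    """Partial sum of the Taylor series of exp about 0: sum of x**k / k!."""
    total, term = 0.0, 1.0  # term starts at x**0 / 0! = 1
    for k in range(terms):
        total += term
        term *= x / (k + 1)  # next term: x**(k+1) / (k+1)!
    return total


def central_difference(f, x, h=1e-5):
    """Approximate f'(x) with the symmetric difference quotient."""
    return (f(x + h) - f(x - h)) / (2 * h)


# Compare the truncated series with the library value of e.
print(exp_taylor(1.0), math.exp(1.0))

# The derivative of sin at 0 is cos(0) = 1; the estimate should be close to that.
print(central_difference(math.sin, 0.0))
```

The same truncated-series idea is what sits behind the linear and quadratic approximations used in gradient-based and Newton-type optimization methods.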

Discrete Math

• Sets, subsets, power sets

• Counting functions, combinatorics, countability

• Basic proof techniques: induction, proof by contradiction

• Basics of inductive, deductive, and propositional logic

• Basic data structures: stacks, queues, graphs, arrays, hash tables, trees

• Graph properties — connected components, degree, maximum flow/minimum cut concepts, graph coloring

• Recurrence relations and equations

• Growth of functions and big-O notation
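
Several items in this list (queues, hash tables, graphs, connected components) come together in one short example. The sketch below assumes an undirected graph stored as a dictionary in which every vertex appears as a key; the function name and the sample graph are made up for illustration.

```python
from collections import deque


def connected_components(graph):
    """Return the connected components of an undirected graph.

    `graph` maps each vertex to an iterable of its neighbours, and every
    vertex is assumed to appear as a key.
    """
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        # Breadth-first search from an unvisited vertex collects one component.
        component = []
        queue = deque([start])
        seen.add(start)
        while queue:
            vertex = queue.popleft()
            component.append(vertex)
            for neighbour in graph[vertex]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        components.append(component)
    return components


# Illustrative graph with two multi-vertex components and one isolated vertex.
example_graph = {
    "a": ["b"],
    "b": ["a", "c"],
    "c": ["b"],
    "d": ["e"],
    "e": ["d"],
    "f": [],
}
print(connected_components(example_graph))
```

Each vertex is enqueued once and each adjacency list is scanned once, so the running time grows linearly with the number of vertices plus edges, which is a simple instance of the growth-of-functions analysis listed above.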

Optimization, Operations Research

• Basics of optimization: how to formulate the problem

• Maxima, minima, convex function, global solution

• Linear programming, simplex algorithm

• Integer programming

• Constraint programming, knapsack problem

• Randomized optimization techniques: hill climbing, simulated annealing, genetic algorithms
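
The knapsack problem mentioned above is a convenient first exercise in formulating an optimization problem: choose a subset of items that maximizes total value without exceeding a weight limit. The sketch below solves the 0/1 variant with dynamic programming, which is a different technique from the constraint or integer programming solvers the list refers to. It assumes non-negative integer weights, and the item data are illustrative.

```python
def knapsack(values, weights, capacity):
    """Best total value for the 0/1 knapsack problem with integer weights."""
    # best[c] = best value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Iterate capacities downwards so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# Illustrative items: the best choice is the second and third item (100 + 120).
print(knapsack(values=[60, 100, 120], weights=[10, 20, 30], capacity=50))
```

The table has one entry per unit of capacity, so the running time is proportional to the number of items times the capacity; for large capacities or non-integer weights, an integer programming formulation is usually the more practical route.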

Some opinions expressed in this article may be those of a guest author and not necessarily those of Analytikus. Staff authors are listed in https://www.datasciencecentral.com/profiles/blogs/essential-math-for-data-science-why-and-how