- Andrea Manero-Bastin

# Essential Math for Data Science

This article was written by Tirthajyoti Sarkar. Below is a summary. The full article also lists courses you could take to learn the topics below, and it has numerous comments. We also added a few topics that we think are important and were missing from the original article.

**Statistics**

Data summaries and descriptive statistics, central tendency, variance, covariance, correlation

Basic probability: basic idea, expectation, probability calculus, Bayes’ theorem, conditional probability

Probability distribution functions: uniform, normal, binomial, chi-square, Student’s t-distribution; the central limit theorem

Sampling, measurement, error, random number generation

Hypothesis testing, A/B testing, confidence intervals, p-values

ANOVA, t-test

Linear and logistic regression, regularization

Decision trees

Robust and non-parametric statistics
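
As a small illustration of the descriptive statistics listed above, here is a minimal sketch that computes sample variance, covariance, and Pearson correlation by hand. The data and the function names are made up for this example; in practice a library such as NumPy or pandas would normally do this work.

```python
import math


def mean(xs):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(xs) / len(xs)


def sample_covariance(xs, ys):
    """Unbiased sample covariance (divides by n - 1)."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)


def sample_variance(xs):
    """Variance is the covariance of a variable with itself."""
    return sample_covariance(xs, xs)


def pearson_correlation(xs, ys):
    """Covariance rescaled by both standard deviations; lies in [-1, 1]."""
    return sample_covariance(xs, ys) / math.sqrt(
        sample_variance(xs) * sample_variance(ys)
    )


# Hypothetical toy data: hours studied vs. exam score.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 55.0, 61.0, 70.0, 72.0]

print("variance of hours:", sample_variance(hours))
print("covariance:", sample_covariance(hours, scores))
print("correlation:", pearson_correlation(hours, scores))
```

The correlation here comes out close to 1, which reflects that the toy scores rise almost linearly with hours.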

**Linear Algebra**

Basic properties of matrices and vectors: scalar multiplication, linear transformation, transpose, conjugate, rank, determinant

Inner and outer products, matrix multiplication rule and various algorithms, matrix inverse

Special matrices: square matrix, identity matrix, triangular matrix, the idea of sparse and dense matrices, unit vectors, symmetric matrix, Hermitian, skew-Hermitian and unitary matrices

Matrix factorization concept/LU decomposition, Gaussian/Gauss-Jordan elimination, solving the linear system of equations Ax=b

Vector space, basis, span, orthogonality, orthonormality, linear least squares

Eigenvalues, eigenvectors, and diagonalization, singular value decomposition (SVD)
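
To make the "solving Ax=b" item concrete, the following is a minimal sketch of Gaussian elimination with partial pivoting followed by back substitution. The function name, the singularity tolerance, and the 3×3 system are assumptions chosen for illustration; production code would typically call a tested routine such as `numpy.linalg.solve` instead.

```python
def solve_linear_system(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Build the augmented matrix [A | b] as floats, leaving the inputs untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:  # tolerance is an arbitrary assumption
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]

        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


# Hypothetical example system; its exact solution is x = (2, 3, -1).
A = [[2, 1, -1], [-3, -1, 2], [-2, 1, 2]]
b = [8, -11, -3]
x = solve_linear_system(A, b)
print(x)
```

Because the arithmetic is floating point, the printed values may differ from 2, 3 and -1 by a tiny rounding error.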

**Calculus**

Functions of a single variable, limit, continuity and differentiability

Mean value theorems, indeterminate forms and L’Hôpital’s rule

Maxima and minima

Product and chain rule

Taylor series, infinite series summation/integration concepts

Fundamental and mean value theorems of integral calculus, evaluation of definite and improper integrals

Beta and Gamma functions

Functions of multiple variables, limit, continuity, partial derivatives

Basics of ordinary and partial differential equations (not too advanced)

**Discrete Math**

Sets, subsets, power sets

Counting functions, combinatorics, countability

Basic proof techniques: induction, proof by contradiction

Basics of inductive, deductive, and propositional logic

Basic data structures: stacks, queues, graphs, arrays, hash tables, trees

Graph properties: connected components, degree, maximum flow/minimum cut concepts, graph coloring

Recurrence relations and equations

Growth of functions and the concept of big-O notation

**Optimization, Operations Research**

Basics of optimization: how to formulate the problem

Maxima, minima, convex function, global solution

Linear programming, simplex algorithm

Integer programming

Constraint programming, knapsack problem

Randomized optimization techniques: hill climbing, simulated annealing, genetic algorithms
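
As one possible illustration of the randomized techniques above, here is a minimal sketch of stochastic hill climbing on a one-dimensional objective. The function names, step size, iteration count, and the objective itself are assumptions made for this example. The objective is concave with a single peak, so the local maximum the search finds is also the global one; on a multi-modal function, plain hill climbing can get stuck, which is the motivation for simulated annealing.

```python
import random


def hill_climb(objective, start, step=0.1, iterations=5000, seed=0):
    """Maximize `objective` by accepting only random moves that improve it."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    best_x, best_val = start, objective(start)
    for _ in range(iterations):
        # Propose a small random perturbation of the current point.
        candidate = best_x + rng.uniform(-step, step)
        val = objective(candidate)
        # Greedy acceptance: keep the candidate only if it is strictly better.
        if val > best_val:
            best_x, best_val = candidate, val
    return best_x, best_val


def concave(x):
    """Hypothetical objective with its single maximum at x = 3."""
    return -(x - 3.0) ** 2


best_x, best_val = hill_climb(concave, start=0.0)
print(best_x, best_val)
```

Simulated annealing differs mainly in the acceptance rule: it sometimes accepts a worse candidate, with a probability that shrinks as a temperature parameter is lowered.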

Some opinions expressed in this article may be those of a guest author and not necessarily those of Analytikus. Staff authors are listed in https://www.datasciencecentral.com/profiles/blogs/essential-math-for-data-science-why-and-how
