Andrea Manero-Bastin

Essential Math for Data Science


This is a summary of an article written by Tirthajyoti Sarkar. The full article also lists courses you can take to learn the topics below, and it includes numerous reader comments. We have added a few topics that we consider important and that the original article omits.

Statistics

  • Data summaries and descriptive statistics: central tendency, variance, covariance, correlation

  • Basic probability: basic idea, expectation, probability calculus, Bayes’ theorem, conditional probability

  • Probability distribution functions: uniform, normal, binomial, chi-square, Student’s t-distribution; the central limit theorem

  • Sampling, measurement, error, random number generation

  • Hypothesis testing, A/B testing, confidence intervals, p-values

  • ANOVA, t-test

  • Linear and logistic regression, regularization

  • Decision trees

  • Robust and non-parametric statistics
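
As a small illustration of the Bayes’ theorem and conditional probability items above, here is a minimal sketch in Python. The function name `posterior` and all of the numbers (a 1% prior, a 95% true-positive rate, a 5% false-positive rate) are assumptions made up for this example; they do not come from the original article.

```python
def posterior(prior, likelihood, false_positive_rate):
    """Return P(H | E) for a binary hypothesis H using Bayes' theorem."""
    # Law of total probability: P(E) = P(E|H) P(H) + P(E|not H) P(not H)
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    # Bayes' theorem: P(H|E) = P(E|H) P(H) / P(E)
    return likelihood * prior / evidence


# Hypothetical numbers, chosen only for illustration.
p = posterior(prior=0.01, likelihood=0.95, false_positive_rate=0.05)
print(f"P(condition | positive test) = {p:.3f}")
```

With these assumed inputs the posterior comes out at roughly 16%, which shows how a low prior can keep the posterior small even when the test itself looks accurate.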

Linear Algebra

  • Basic properties of matrices and vectors: scalar multiplication, linear transformation, transpose, conjugate, rank, determinant

  • Inner and outer products, the matrix multiplication rule and various algorithms for it, matrix inverse

  • Special matrices: square matrix, identity matrix, triangular matrix, the idea of sparse and dense matrices, unit vectors, symmetric matrix, Hermitian, skew-Hermitian, and unitary matrices

  • Matrix factorization concept/LU decomposition, Gaussian/Gauss-Jordan elimination, solving the linear system of equations Ax = b

  • Vector space, basis, span, orthogonality, orthonormality, linear least squares

  • Eigenvalues, eigenvectors, and diagonalization, singular value decomposition (SVD)
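
To make the Gaussian elimination and Ax = b items above concrete, here is a minimal sketch in plain Python. The function name `solve_linear_system`, the singularity tolerance, and the example system are assumptions for illustration; in practice a numerical library would normally be used instead.

```python
def solve_linear_system(a, b):
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(a)
    # Build the augmented matrix [A | b] from copies so the inputs are not mutated.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]

    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:  # tolerance is an arbitrary assumption
            raise ValueError("matrix is singular or nearly singular")
        m[col], m[pivot] = m[pivot], m[col]

        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]

    # Back substitution on the resulting upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


# Hypothetical 2x2 system: 2x + y = 3 and x + 3y = 5.
solution = solve_linear_system([[2, 1], [1, 3]], [3, 5])
print(solution)
```

For this example system the exact solution is x = 0.8, y = 1.4; the printed values may differ from these in the last decimal places because of floating-point rounding.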

Calculus

  • Functions of a single variable: limit, continuity, and differentiability

  • Mean value theorems, indeterminate forms, and L’Hôpital’s rule

  • Maxima and minima

  • Product and chain rules

  • Taylor series, infinite series summation/integration concepts

  • Fundamental and mean value theorems of integral calculus, evaluation of definite and improper integrals

  • Beta and Gamma functions

  • Functions of multiple variables: limit, continuity, partial derivatives

  • Basics of ordinary and partial differential equations (not too advanced)

Discrete Math

  • Sets, subsets, power sets

  • Counting functions, combinatorics, countability

  • Basic proof techniques: induction, proof by contradiction

  • Basics of inductive, deductive, and propositional logic

  • Basic data structures: stacks, queues, graphs, arrays, hash tables, trees

  • Graph properties: connected components, degree, maximum flow/minimum cut concepts, graph coloring

  • Recurrence relations and equations

  • Growth of functions and the concept of big-O notation

Optimization, Operations Research

  • Basics of optimization: how to formulate the problem

  • Maxima, minima, convex function, global solution

  • Linear programming, simplex algorithm

  • Integer programming

  • Constraint programming, knapsack problem

  • Randomized optimization techniques: hill climbing, simulated annealing, genetic algorithms
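
As one concrete instance of the knapsack problem listed above, here is a minimal sketch of the 0/1 variant in Python. It uses dynamic programming rather than constraint programming or any of the randomized techniques in the list, and it assumes integer weights and capacity. The function name `knapsack` and the item values and weights are made up for illustration.

```python
def knapsack(values, weights, capacity):
    """Return the best total value for the 0/1 knapsack problem.

    Assumes non-negative integer weights and an integer capacity.
    """
    # best[c] is the best value achievable with total weight at most c.
    best = [0] * (capacity + 1)
    for value, weight in zip(values, weights):
        # Go through capacities from high to low so each item is used at most once.
        for c in range(capacity, weight - 1, -1):
            best[c] = max(best[c], best[c - weight] + value)
    return best[capacity]


# Hypothetical items: values 60, 100, 120 with weights 10, 20, 30.
print(knapsack([60, 100, 120], [10, 20, 30], capacity=50))  # prints 220
```

The table has one entry per unit of capacity, so the running time grows with the number of items times the capacity, which is why this approach only suits moderate integer capacities.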

Some opinions expressed in this article may be those of a guest author and not necessarily those of Analytikus. Staff authors are listed at https://www.datasciencecentral.com/profiles/blogs/essential-math-for-data-science-why-and-how
