40 Techniques Used by Data Scientists

November 9, 2017

These techniques cover most of what data scientists and related practitioners use in their daily work, whether they rely on solutions offered by a vendor or design proprietary tools. Clicking on any of the 40 links below leads to a selection of articles related to the entry in question. Most of these articles are hard to find with a Google search, so in some ways this list gives you access to the hidden literature on data science, machine learning, and statistical science. Many of these articles are fundamental to understanding the technique in question, and come with further references and source code.

 

 

Starred techniques (marked with a *) belong to what I call deep data science, a branch of data science that has little if any overlap with closely related fields such as machine learning, computer science, operations research, mathematics, or statistics. Even classical machine learning and statistical techniques, such as clustering, density estimation, or tests of hypotheses, have model-free, data-driven, robust versions designed for automated processing (as in machine-to-machine communications), and thus also belong to deep data science. However, these techniques are not starred here, as their standard versions are better known (and unfortunately more widely used) than their deep data science equivalents.
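To make "model-free, data-driven" a little more concrete, here is a minimal sketch of a percentile bootstrap confidence interval, which estimates an interval by resampling the data instead of assuming a distribution. Treating this as an example of the versions described above is an assumption on my part, not something the linked articles confirm; the function name `bootstrap_ci`, its parameters, and the synthetic sample are all hypothetical.

```python
import random
from statistics import mean


def bootstrap_ci(data, stat=mean, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for `stat`, with no distributional assumption.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n = len(data)
    # Recompute the statistic on many resamples drawn with replacement.
    stats = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    alpha = (1 - level) / 2
    lower = stats[round(alpha * n_resamples)]
    upper = stats[round((1 - alpha) * n_resamples) - 1]
    return lower, upper


# Synthetic, skewed sample (made up for illustration).
sample_rng = random.Random(1)
sample = [sample_rng.expovariate(1.0) for _ in range(200)]

lower, upper = bootstrap_ci(sample)
print(f"sample mean: {mean(sample):.3f}")
print(f"95% bootstrap interval: [{lower:.3f}, {upper:.3f}]")
```

Because the interval comes straight from the resampled statistics, the same function works for other statistics (for example `statistics.median`) without changing the procedure.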

 

To learn more about deep data science, click here. Note that unlike deep learning, deep data science is not the intersection of data science and artificial intelligence; however, the analogy between the two is not completely meaningless, in the sense that both deal with automation.

Also, to discover in which contexts and applications the 40 techniques below are used, I invite you to read the following articles:

  • 21 data science systems used by Amazon to operate its business

  • 24 Uses of Statistical Modeling

Finally, when using a technique, you need to test its performance. Read this article about 11 Important Model Evaluation Techniques Everyone Should Know.
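As a rough illustration of what testing performance can look like, the sketch below applies k-fold cross-validation (entry 42 in the list) to a simple linear regression (entry 1). It uses only the Python standard library; the helper names `fit_line` and `k_fold_mse` and the synthetic data are assumptions made for this example and do not come from the linked article.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def k_fold_mse(xs, ys, k=5, seed=0):
    """Average mean squared error on the held-out fold, over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint index sets
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        # Fit on the training indices only, then score on the held-out fold.
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        scores.append(mean((ys[i] - (a + b * xs[i])) ** 2 for i in fold))
    return mean(scores)


# Synthetic data (made up for illustration): y = 2 + 3x plus Gaussian noise.
rng = random.Random(42)
xs = [i / 10 for i in range(100)]
ys = [2 + 3 * x + rng.gauss(0, 1) for x in xs]

cv_mse = k_fold_mse(xs, ys, k=5)
print(f"5-fold cross-validated MSE: {cv_mse:.3f}")
```

The point of the design is that each observation is scored by a model that never saw it during fitting, so the reported error is less optimistic than the error measured on the training data.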

 

 

 

The 40 data science techniques

  1. Linear Regression 

  2. Logistic Regression 

  3. Jackknife Regression *

  4. Density Estimation 

  5. Confidence Interval 

  6. Test of Hypotheses 

  7. Pattern Recognition 

  8. Clustering - (aka Unsupervised Learning)

  9. Supervised Learning 

  10. Time Series 

  11. Decision Trees 

  12. Random Numbers 

  13. Monte-Carlo Simulation 

  14. Bayesian Statistics 

  15. Naive Bayes 

  16. Principal Component Analysis - (PCA)

  17. Ensembles 

  18. Neural Networks 

  19. Support Vector Machine - (SVM)

  20. Nearest Neighbors - (k-NN)

  21. Feature Selection - (aka Variable Reduction)

  22. Indexation / Cataloguing *

  23. (Geo-) Spatial Modeling 

  24. Recommendation Engine *

  25. Search Engine *

  26. Attribution Modeling *

  27. Collaborative Filtering *

  28. Rule System 

  29. Linkage Analysis 

  30. Association Rules 

  31. Scoring Engine 

  32. Segmentation 

  33. Predictive Modeling 

  34. Graphs 

  35. Deep Learning 

  36. Game Theory 

  37. Imputation 

  38. Survival Analysis 

  39. Arbitrage 

  40. Lift Modeling 

  41. Yield Optimization

  42. Cross-Validation

  43. Model Fitting

  44. Relevancy Algorithm *

  45. Experimental Design

The list now contains more than 40 techniques because the article has been updated with additional entries.

 

 

 

Some opinions expressed in this article may be those of a guest author and not necessarily those of Analytikus. Staff authors are listed at https://www.datasciencecentral.com/profiles/blogs/40-techniques-used-by-data-scientists

 
