
Data Science Roadmap

1. Linear Algebra

  • Vectors
  • Matrices
  • Transpose of a matrix
  • Inverse of a matrix
  • Determinant of a matrix
  • Trace of a matrix
  • Dot product
  • Eigenvalues
  • Eigenvectors

2. Statistics

  • Analyzing categorical data
  • Displaying and comparing quantitative data
  • Summarizing quantitative data
  • Modeling data distributions
  • Exploring bivariate numerical data
  • Study design
  • Counting, permutations, and combinations
  • Sampling distributions
  • Gaussian distribution
  • Confidence intervals
  • Significance tests (hypothesis testing)

3. Probability

  • Basic theoretical probability
  • Probability using sample spaces
  • Basic set operations
  • Experimental probability
  • Addition rule
  • Multiplication rule for independent events
  • Multiplication rule for dependent events
  • Conditional probability and independence

4. Calculus and Optimization

  • Derivative
  • Finding minima and maxima
  • Multivariable Functions
  • Partial differentiation
  • Exponential function, Exponential decay
  • Logarithmic Functions
  • Distance Measurement

5. Computer Science

  • List, Stack, Queue
  • Hash function, Hash table
  • Sorting: Insertion Sort, Selection Sort, Bubble Sort, Quicksort
  • Binary Tree, Trie
  • Binary Search
  • Recursion
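
As a starting point for the sorting and binary search items above, here is a minimal sketch in plain Python. The function names `insertion_sort` and `binary_search` are illustrative choices, not part of the roadmap, and the code is meant for practice rather than production use.

```python
def insertion_sort(items):
    """Return a new list sorted in ascending order using insertion sort."""
    result = list(items)  # copy so the caller's list is left unchanged
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Shift larger elements one slot to the right to make room for key.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or -1 if it is absent."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1  # target can only be in the right half
        else:
            hi = mid - 1  # target can only be in the left half
    return -1
```

Binary search assumes its input is already sorted, which is why the two topics are worth practicing together.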

6. Programming & Deployment

  • Variable
  • If Else
  • Loop
  • Operator
  • Function
  • String
  • Unit test

7. Database & Big Data

  • MySQL
  • PostgreSQL
  • Microsoft SQL Server

8. Machine learning

  • Machine Learning: how to use APIs and libraries
  • Overfitting, Underfitting
  • Regularization
  • Simple Evaluation Metrics: MAE, MSE, RMSE, MAPE. Confusion Matrix, Precision, Recall, Accuracy, F1-Score, ROC-AUC
  • Imbalanced Data Handling
  • Loss function
  • Missing Value Handling
  • Feature Engineering, Feature Selection
  • Cross validation
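
To make the classification metrics above concrete, here is a small sketch that derives precision, recall, and F1-score from confusion matrix counts for a binary problem. The helper names and the `positive` parameter are assumptions made for illustration. In practice a library would normally be used instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, f1); each is 0.0 when undefined."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

Returning 0.0 when a denominator is zero is one common convention. Other conventions exist, so treat it as a design choice rather than a rule.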

9. Computer Vision

  • Image Processing: OpenCV
  • Convolution, Max Pooling
  • Histogram of oriented gradients (HOG)
  • Image Classification (SVM)

10. Natural Language Processing

  • Text Processing, Regex
  • Tokenizer, Stemming, Lemmatization
  • N-Grams
  • Part-of-Speech Tagging (POS Tagging)
  • Language Model, Probability model
  • TF-IDF, BM25
  • Text Classification (SVM, Logistic Regression, Naive Bayes)

11. Deep Learning

  • Batch size
  • Tensor, CUDA
  • Dropout
  • Normalization, Regularization
  • Vanishing & Exploding Gradients
  • Activation functions: Sigmoid, ReLU, SELU, Tanh
  • Backpropagation Algorithm
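
The four activation functions listed above can be sketched as scalar functions using only the standard library. This is an illustrative sketch, not how a deep learning framework implements them. The SELU constants are the commonly published values, and the function names are chosen here for clarity.

```python
import math

# Commonly published SELU constants (assumed here, rounded to double precision).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def sigmoid(x):
    """Logistic sigmoid, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)  # safe: x is negative, so exp(x) is at most 1
    return z / (1.0 + z)


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def selu(x):
    """Scaled exponential linear unit."""
    if x > 0:
        return SELU_SCALE * x
    return SELU_SCALE * SELU_ALPHA * (math.exp(x) - 1.0)


def tanh(x):
    """Hyperbolic tangent, delegating to the standard library."""
    return math.tanh(x)
```

Sigmoid and tanh saturate for large inputs, which connects to the vanishing gradient item in the same list.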

12. Time Series Analysis

  • Definition
  • Time Series Patterns: Trend, Seasonal, Cyclic
  • Time Series Decomposition: Trend, Seasonal, Residual
  • Stationary Time Series
  • Autocorrelation, Partial Autocorrelation, Autoregression
  • Smoothing Time Series
  • ARMA, ARIMA, SARIMA, GARCH, ARCH

13. Recommender System

  • Not required

Follow my Fanpage to get the latest posts!
https://www.facebook.com/datasciencedances/