Beginner-friendly Data Science/ML/AI syllabus

If you search for an online course on Data Science, ML and AI, the content on offer ranges from a one-day workshop to a four-year B.Tech/B.E. specialization in AI. In this article I have tried to create a 3-4 week curriculum for beginners in data science/ML. The focus is on the theoretical aspects that help with interviews, publishing, and understanding the mechanisms behind the algorithms.

The curriculum is spread over 3 weeks (week 3 can be extended into a fourth, depending on the comfort level of the student), and the weekly content is balanced in breadth and depth of topics. How much time to spend on any single topic is largely up to the reader.

The field is vast and you may find a few topics missing, but covering everything would not fit into a 3-week duration. The aim of this curriculum is to make you ready for at least 2-3 capstone projects and to introduce you to the field in as much detail as that time allows.

Week 1

  1. Need of automation, introduction to machine intelligence.
  2. What is a dataset? Balanced and imbalanced datasets, static vs temporal data
  3. Types of variables/features in a data set.
  4. Distributions, need of identifying distributions.
  5. Types of distributions
  6. Training, CV (cross-validation) and test data, difference between train and test distributions
  7. Gaussian distribution, standard normal variate, Chebyshev’s inequality
  8. Real life examples of various distributions (log-normal, power-law etc.)
  9. Mean, median, quantiles, variance, kurtosis, skewness (moments about the mean)
  10. PDF, CDF
  11. Central limit theorem
  12. Probability and hypothesis testing
  13. Comparing distributions, KS testing
  14. QQ plots
  15. Transforming distributions
  16. Covariance, correlations, Pearson correlation coefficient, Spearman rank correlation coefficient, Box-Cox transforms
  17. Correlation vs causation
  18. Matrix factorization, cosine similarity
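
To make topic 16 a little more concrete, here is a minimal sketch of the difference between the Pearson and Spearman coefficients, using only the Python standard library. The helper names (`pearson`, `ranks`, `spearman`) and the toy data are my own choices for illustration, and the ranking helper assumes there are no tied values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def ranks(values):
    """Ranks from 1 to n. Assumes no ties, to keep the sketch short."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = float(rank)
    return result

def spearman(xs, ys):
    """Spearman correlation is the Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))

# Toy data: y grows monotonically with x, but not linearly.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [x ** 3 for x in xs]

print("Pearson :", pearson(xs, ys))
print("Spearman:", spearman(xs, ys))
```

Because the relationship is monotonic, the rank-based Spearman coefficient comes out as 1, while Pearson stays below 1 since the relationship is not a straight line.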

Topics students need to cover: confidence intervals. Code part: data preprocessing, EDA on the above topics.

  1. Supervised, unsupervised and reinforcement learning definitions
  2. Feature scaling, handling missing values
  3. Outliers, RANSAC
  4. Preprocessing categorical values, label encoding, one hot encoding
  5. Regression vs classification
  6. Bias variance trade-off
  7. MSE, log-loss, performance metrics (accuracy, AUC-ROC, TPR, FPR), need for cost-function, differentiability requirements
  8. Basics of 3d geometry, hyper-planes, hypercubes, generalization to n dimensions
  9. What is a model? Interpretability of a model? Business requirements
  10. Domain Knowledge
  11. Intro to logistic regression, sigmoid function and its probability interpretation, need for regularization, formulation of regularization in logistic regression, types of regularization, feature sparsity with L1, hyper-parameter tuning (manual, grid search, random search)
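
As a small illustration of topics 7 and 11, the sketch below computes the sigmoid and a log-loss with an L2 penalty in plain Python. The function names, the toy data, and the penalty strength `lam=0.1` are assumptions made for this example, and the penalty is written as `lam * sum(w^2)`, which is only one of several common conventions.

```python
import math

def sigmoid(z):
    """Map a real-valued score to (0, 1). Branching on the sign avoids overflow in exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def regularized_log_loss(weights, bias, X, y, lam=0.1):
    """Mean log-loss plus an L2 penalty lam * sum(w^2); lam is an illustrative value."""
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for features, label in zip(X, y):
        score = sum(w * f for w, f in zip(weights, features)) + bias
        p = min(max(sigmoid(score), eps), 1.0 - eps)
        total += -(label * math.log(p) + (1 - label) * math.log(1.0 - p))
    return total / len(X) + lam * sum(w * w for w in weights)

# Toy dataset: three points with two features each.
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
y = [0, 1, 1]

# With all-zero weights every predicted probability is 0.5.
print(regularized_log_loss([0.0, 0.0], 0.0, X, y))
# The same weights cost more once the penalty is switched on.
print(regularized_log_loss([1.0, 1.0], 0.0, X, y, lam=0.0))
print(regularized_log_loss([1.0, 1.0], 0.0, X, y, lam=0.1))
```

The penalty term grows with the size of the weights, which is what pushes the optimizer towards smaller (L2) or sparser (L1, using absolute values instead of squares) solutions.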

Week 2

  1. Linear regression
  2. Assumptions of linear regression
  3. MAPE, R^2
  4. Distance metrics, KNN, problems with KNN, k-d trees, LSH (locality-sensitive hashing)
  5. Clustering algorithms, performance metrics for unlabelled data
  6. K-means, k-means++
  8. Reachability distance, LOF (Local outlier factor)
  9. Revisiting conditional probability
  10. Bayes theorem, basics of NLP (stemming, stop words, BoW, TF-IDF)
  11. Naive Bayes, assumptions, log probabilities
  12. Laplace smoothing (handling unseen values and zero probabilities in Naive Bayes)
  13. Naive Bayes for continuous variables
  14. Dimensionality reduction
  15. Curse of dimensionality
  16. PCA
  17. Eigenvectors, eigenvalues, linear transformations
  18. Lagrange multipliers
  19. Solving PCA objective function
  20. SNE, t-SNE, KL divergence
  21. t-SNE limitations
  22. Intro to Decision Trees
  23. Entropy, Gini impurity, pruning of trees
  24. Splitting nodes for continuous variables
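
For topic 23, the two impurity measures can be sketched in a few lines of standard-library Python. The function names and the toy labels are my own, and entropy is measured in bits (log base 2) here, which is a common choice but not the only one.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy in bits of the class labels in a node."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: chance that two labels drawn at random from the node differ."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, left, right):
    """Entropy of the parent minus the size-weighted entropy of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(parent) - weighted

# Toy node with a 50/50 class mix, and a split that separates the classes perfectly.
parent = ["yes", "yes", "no", "no"]
left, right = ["yes", "yes"], ["no", "no"]

print("entropy:", entropy(parent))
print("gini   :", gini(parent))
print("gain   :", information_gain(parent, left, right))
```

A decision tree picks, at each node, the split with the largest gain (or the largest drop in Gini impurity); a perfectly mixed two-class node is the worst case for both measures.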

Deep Learning

Week 3

  1. Neuron structure, Neural networks
  2. Perceptron
  3. MLP, weight matrices, hidden layers
  4. Gradients, learning rate, saddle points, local and global minima
  5. Forward propagation and backpropagation
  6. GD vs SGD
  7. Activation functions, vanishing gradient problems
  8. Parameters vs Hyper-parameters of a network
  9. Weight initialization techniques
  10. Symmetric initialization
  11. Random initialization
  12. Math behind Xavier/Glorot initialization
  13. He weight initialization
  14. Contour plots, Batch-Normalization
  15. Optimizers
  16. Momentum, NAG, AdaDelta, AdaGrad, RMSProp, Adam
  17. Softmax in multi-class classification
  18. CNN feature extraction, different layers used in CNNs
  19. Channels, padding, strides
  20. Filters, kernels, max, min and average pooling
  21. Transfer learning
  22. Residual networks
  23. Image segmentation (basics)
  24. Object detection (basics), brief discussion on GANs
  25. RNNs, sequential information
  26. Vanishing gradients
  27. Sharing weights (comparison with CNN)
  28. LSTMs, GRUs
  29. Gates in LSTMs
  30. Encoder-decoder models, context vector
  31. Bidirectional networks
  32. BLEU score
  33. Disadvantages of one-hot and BoW models, space efficiency
  34. Semantic relation of words
  35. Representation of words as vectors: word embeddings
  36. Word2Vec model
  37. CBOW
  38. Skip-Gram
  39. Embedding matrix
  40. GloVe vectors
  41. Attention mechanism (NLP)
  42. Local vs global attention
  43. Transformer architecture, self-attention
  44. Query, key and value matrices
  45. Multi-head and masked attention, intro to BERT (encoder-only stacks), GPT-2 and GPT-3 (decoder-only stacks)
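
To tie together the query/key/value and masked attention topics, here is a minimal single-head sketch of scaled dot-product attention in plain Python. The function names, the `causal` flag, and the tiny matrices are assumptions for illustration; real models use learned projection matrices and a tensor library rather than nested lists.

```python
import math

def softmax(scores):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(Q, K, V, causal=False):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V, one row at a time."""
    d_k = len(K[0])
    outputs = []
    for i, q in enumerate(Q):
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in K]
        if causal:
            # Masked attention: position i may only look at positions 0..i.
            scores = [s if j <= i else float("-inf") for j, s in enumerate(scores)]
        weights = softmax(scores)
        outputs.append([sum(w * v[c] for w, v in zip(weights, V))
                        for c in range(len(V[0]))])
    return outputs

# Two positions with identical queries and keys, so unmasked weights are uniform.
Q = [[1.0, 0.0], [1.0, 0.0]]
K = [[1.0, 0.0], [1.0, 0.0]]
V = [[1.0, 2.0], [3.0, 4.0]]

print(attention(Q, K, V))               # every row is the average of the value rows
print(attention(Q, K, V, causal=True))  # the first row can only see the first value row
```

Multi-head attention runs several such heads in parallel on different learned projections of the input and concatenates their outputs; the causal mask is what decoder-only stacks rely on to avoid looking at future tokens.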