Statistical Foundations for Data Science and Machine Learning
This course serves as an introduction to the basic statistical principles commonly used by data scientists and applied statisticians. Many of the concepts will be reinforced using the statistical programming language R, one of the two most popular languages for data science. The course aims to expose students to common statistical issues and teach them how to avoid statistical fallacies. By the end of the course, students will have a fundamental understanding of many of the statistical principles that underlie machine learning and data science. Topics to be covered: Basic Probability, Expected Value, Variance, Point Estimates, Introduction to R; Further Probability, Central Limit Theorem, Law of Large Numbers, Hypothesis Testing; P-Values, Multiple Comparisons, Bonferroni Adjustment; Introduction to Regression, Prediction, Hypothesis Testing for Regression; Model Selection for Regression, Backward/Forward Selection, R^2 and Other Selection Criteria.