A Gentle Introduction to Statistical Computing Using R
Introduction
The Department of Mathematics, Institute of Chemical Technology, Mumbai offers the Multidisciplinary Minor (MDM) Programme in Machine Learning and Artificial Intelligence under the National Education Policy (NEP 2020). This material was developed during the lectures of the course Statistical Computing (MAT 1501). The course is designed to give students some fundamental statistical ideas so that they can grasp deeper concepts in the Machine Learning and Deep Learning courses in upcoming semesters. All the code in this study material was written live in the classroom to demonstrate various statistical concepts. Yes, you read that correctly: the code was developed in real time during the sessions. Later, some refinements were made by incorporating mathematical expressions and adding general explanations of the concepts used. In my experience, Quarto is an exceptionally user-friendly platform for generating dynamic documents, and it supports multiple programming languages.
Additionally, I would like to emphasize that I have rarely used external packages for demonstration. Instead, I have relied primarily on basic loops and matrices to write programs. Over time, I have observed that students often focus on memorizing package names and perceive R/Python as magical software, treating it as a black box. In this document, you will see that we carry out an entire regression analysis in matrix notation before implementing it with built-in functions.
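To give a flavour of what "matrix notation first, built-in function second" can look like, here is a minimal sketch in R. It is not taken from the lectures: the simulated data, the variable names, and the choice of a single predictor are all assumptions made for illustration. It computes the least-squares estimate $\hat{\beta} = (X^\top X)^{-1} X^\top y$ by solving the normal equations and then compares the result with `lm()`.

```r
# Simulated data (hypothetical; chosen only for illustration)
set.seed(1)
n <- 50
x <- runif(n, 0, 10)
y <- 2 + 3 * x + rnorm(n, sd = 1)

# Design matrix with a column of ones for the intercept
X <- cbind(1, x)

# Least-squares estimate from the normal equations (X'X) beta = X'y
beta_hat <- solve(t(X) %*% X, t(X) %*% y)

# The same fit using the built-in function
fit <- lm(y ~ x)

# Both approaches should give the same coefficients up to rounding error
print(cbind(matrix = as.vector(beta_hat), lm = unname(coef(fit))))
```

Solving the linear system with `solve(A, b)` avoids forming the inverse explicitly; `lm()` uses a QR decomposition internally, so the two sets of estimates agree only up to floating-point error.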
This document is not intended to make you an expert in machine learning; rather, it aims to help you grasp fundamental concepts in Data Science. My goal is for students to understand the underlying ideas and make informed decisions when selecting appropriate algorithms in future courses, rather than blindly following R/Python instructions. Typically, these concepts are taught through PowerPoint presentations, visually appealing slides, or engaging talks. I initially considered taking a similar approach, but the students’ eagerness to explore and understand the concepts in depth inspired me to demonstrate live programming in R. Ultimately, it was the students who guided this teaching approach, and I found that it worked quite well. Because the code was written live, it may not be efficient, and I am sure there are better and smarter ways to write it. The primary goal, however, is to understand the statistical ideas, not to learn programming in R.
Rationale
This is a foundation course covering major concepts from probability and statistical estimation theory for undergraduate engineering students. The concepts introduced here will be useful in understanding Data Science, Machine Learning, and Deep Learning, which have wide applications across engineering disciplines.
Prerequisites
Basic linear algebra, differential calculus, and basic probability theory, including conditional probability and Bayes' theorem.
Acknowledgement
A big thank you to the undergraduate engineering students who have chosen the Multidisciplinary Minor (MDM) Degree in Machine Learning and Artificial Intelligence (ML and AI), offered by the Department of Mathematics under NEP 2020. A special shoutout to Prathamesh, Agastya, Venkatesh, Arya Rane, Ajinkya, Arya Shimpi, Arya Kale, Reenesh, and many others. Interestingly, there are three students named Arya, and in class I often say that a question will be answered by one Arya (selected with probability 1/3 😀).
In addition, special thanks to the M.Sc. in Engineering Mathematics students for their support in the classroom and in expanding this material with more advanced examples tailored to the M.Sc. level. A special mention to Hiloni, Sangeeta, Vaishnavi, and others for their invaluable support.
Thanks also to my Ph.D. student Dipali for introducing me to Quarto and for promptly resolving every issue I faced while compiling this document. Thanks to Riddhi for many thoughtful discussions on education practices in general, which gave me the confidence to try something new. The encouragement of the students Ruqaiya, Sanyukta, Urbi, Shweta, Supriya, and many others finally convinced me to put these lectures online 💫. Thanks to all the students who have been patient listeners in my lectures and who have offered great suggestions.