Course Materials
Course website and communication
We will use the course’s Pitt Canvas site as the primary place to host course-related information. All class materials, including discussions and assignments, are posted on Pitt Canvas.
The course is designed to be completed asynchronously, and there are no mandatory weekly meetings. The instructor is available for consultations through Zoom during office hours and by appointment (Spring 2026: Monday & Friday 12:00pm-1:00pm, plus by appointment).
Info
Use Pitt email, Zoom, and Canvas for course-related communication.
Books & Reading Materials
No Textbook Purchase Required
No textbook purchase is required for this course.
Links to all reading materials and programming scripts for the course are posted on Canvas. The instructor curated these materials from multiple books, academic papers, teaching cases, and relevant online tutorials posted by industry practitioners and academic scholars.
Suggested primary references:
Basic Python and R:
- The Python Tutorial and the Python for Everybody webpage are excellent resources.
- Introduction to R: A Programming Environment for Data Analysis and Graphics
- Think Python, second edition, by Allen Downey. A PDF version of the book is available here.
- Automate the Boring Stuff with Python, by Al Sweigart.
Python and R ecosystem for data analysis:
- Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython, third edition, by Wes McKinney.
- Natural Language Processing with Python: Analyzing Text with the Natural Language Toolkit, http://www.nltk.org/book/
- R for Data Science, https://r4ds.had.co.nz/
- Text Mining with R: A Tidy Approach, https://www.tidytextmining.com/
- An Introduction to Statistical Learning, https://www.statlearning.com/
- Data Mining for Business Analytics: Concepts, Techniques, and Applications in Python, by Galit Shmueli et al., 2019.
- Data Mining for Business Analytics: Concepts, Techniques, and Applications in R, by Galit Shmueli et al., 2017.
Software Tools 🧑‍💻
- For Python programming, we’ll use Google Colab. This free tool runs in the cloud, so no software packages need to be installed locally. Students will need a web browser with an internet connection. Log in using a personal Google account or Pitt credentials.
- If local installation on student computers is preferred, the free Anaconda distribution (Python 3.11 or higher) is recommended.
- For R programming, we’ll use the free and open-source R distribution and RStudio (free, desktop edition) for local installation. A free cloud-hosted option is available through Posit Cloud.
- NotebookLM is a new generative-AI-powered tool designed to help users understand complex information. It provides an integrated space for reading, thinking, asking questions, and writing. Two graded assignments require access to this service. The free version of the service will fit our needs.
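For students who choose the local Anaconda option, the following is a minimal sketch of how to confirm that the installed interpreter meets the Python 3.11 recommendation. The helper name `meets_minimum` and the constant `MIN_VERSION` are illustrative and not part of any course script.

```python
import sys

# Minimum version recommended for local installation (Python 3.11 or higher).
MIN_VERSION = (3, 11)


def meets_minimum(version=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) version is at least the minimum.

    Hypothetical helper for illustration; defaults to the running interpreter.
    """
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    # Print the running interpreter's version and whether it is recent enough.
    print("Python", sys.version.split()[0])
    print("Meets the 3.11 recommendation:", meets_minimum())
```

The same cell can also be run in Google Colab to see which Python version the hosted environment provides.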