map(python, data)
Introduction
This site contains the course notes for CAPP 30122: Computer Science with Applications 2.
These notes were originally written by James Turk for CAPP 30122 at The University of Chicago.
This course assumes that you have basic familiarity with Python and the concepts introduced in CAPP 30121. It aims to take those skills and help you connect them to the real world by obtaining, cleaning, and analyzing data.
Goals
More specifically, the course has three distinct goals:
1. Expand your Python & general programming knowledge.
In 1 Python Packages, 2 Python Tools, 3 Debugging, and 4 Testing, we start with some practical considerations that prepare you to write larger programs in Python.
After spending some time with data, we’ll explore some core computer science topics: 8 Complexity & Performance, 9 Data Structures, 10 Object-Oriented Programming, and 11 Graphs.
Finally, 16 Data Pipelines and 17 Architecture will return to the big picture: writing larger applications and composing multiple programs.
2. Introduce data engineering and analysis techniques.
A major theme of this course is connecting your programming skills to real-world data:
In 5 Data Formats, 6 HTTP & APIs, and 7 Web Scraping, we will explore obtaining real data from the web.
12 Geospatial Data, 13 Record Linkage, 14 Regular Expressions, and 15 Data Visualization will look at challenges common in working with data.
3. Gain experience working on a team.
When writing code as part of a team, you realize just how much readability & design matter. We’ll introduce programming concepts that help preserve isolation between components, and tools that help enforce code quality.
How this differs from 30121
This course is less linear than 30121, and covers a much wider variety of material.
When you first learn to program, or pick up a new language, almost every topic has to build on the last. That is less true here. Occasionally we’ll spend a few lectures on a topic, but if you find a topic is not resonating with you, do not worry: we will be on another topic in a week or so.
This means you can decide to come back & revisit a topic like web scraping at your own convenience if it does not land the first time. It will not hold up your ability to learn the remaining material.
It also means that we cannot cover every topic in depth, and you will need to rely on outside resources. This is a feature, not a bug. As you will see, applied programming means using lots of third-party libraries, and it would be neither feasible nor desirable to have someone lecture to you about the ins & outs of libraries that are evolving every year.
Perhaps we should add a bonus fourth goal:
- Teach you to expand your programming skills in a self-guided way.
This, after all, is what you will inevitably need to do in any job where programming is involved. Any team will be using some bespoke library, cutting-edge framework, or esoteric domain-specific language.[1] Learning to navigate and read documentation is a relevant skill.
Why we’re here
Changing the world, bit by bit.
This is the class where we are going to really start connecting your code to the real world. This is an exciting transformation in your ability to create change.
Throughout it all, we will keep our overall goal in mind.
We are here to make things better, with code as a tool.
map(python, data) by James Turk is licensed Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International
If you find these notes useful, please drop me a note!
[1] Domain-specific languages are small languages for a specific purpose such as matching text patterns or identifying content in a webpage. We will see a few in this course.