INFO634
Revised
Rick Watson
Email to set up a chat or video connection
Class: Wednesday 1-4
Provides students with entry-level knowledge of data science, along with experience with diverse methods and technologies related to common aspects of data science.
The course syllabus is a general plan for the course; deviations announced to the class by the instructor may be necessary.
Students completing the course will have foundational skills in the use of common data science tools, including:
Wickham, H., & Grolemund, G. (2017). R for data science. O’Reilly.
Great R packages for data import, wrangling and visualization
The due time is 11:59pm on the Friday after class.
The class will read a variety of recent articles on topics in data science and related issues. I will randomly call on students in class to identify the 2-3 key points made by the article. If you are not prepared, you could lose up to the full 5 points allocated to readings as part of the course grade.
Assignments will be done in pairs because of the demonstrated value of pair learning. You are expected to help each other learn R. Please notify the instructor by 11:59pm on 20/7 of the composition of your pair.
Note: A watt is the unit of power whereas a kilowatt-hour (kWh) is the unit of energy. You can compare a watt to how fast water is flowing out of a water pipe. A kWh is equivalent to a power consumption of 1,000 watts for 1 hour.
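As a quick illustration of the relationship in the note above, the following minimal sketch converts a power draw and a running time into energy. It is written in Python for brevity; the function name `energy_kwh` is illustrative only and is not part of the course materials.

```python
def energy_kwh(power_watts: float, hours: float) -> float:
    """Return the energy, in kWh, used by a device drawing
    power_watts watts continuously for the given number of hours."""
    # 1 kWh = 1,000 watts sustained for 1 hour
    return power_watts * hours / 1000.0


# 1,000 W for 1 hour is, by definition, 1 kWh
print(energy_kwh(1000, 1))

# A 100 W device must run for 10 hours to use the same energy
print(energy_kwh(100, 10))
```

Both calls describe the same amount of energy: power is the rate of use, while energy is that rate multiplied by the time it is sustained.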
Using the merged file created in the previous assignment, do the following
A file contains details of CO2 emissions per capita for the four largest economies in the Americas. Use Exploratory to read the file and convert it into a format suitable for use with R. (1) Report the average CO2 per capita for each country in descending order, and (2) prepare a bar chart showing the average CO2 per capita for each country. Create a Word document with your results.
Read the temperature data for Central Park. Compute the average temperature for each year, and create a scatter graph with a linear regression line. Create a Word document with your results.
As there are 50 students in the class, we will have around 12 teams of 4 members for the project. Please notify the instructor by 11:59pm on 20/7 of the composition of your group. Pairs can combine to create a group of four.
Identify a problem, use R to explore data related to it, and prepare a report on your analysis and related recommendations. You should explore some locally available open data sets, such as https://www.data.govt.nz/, http://opendata.canterburymaps.govt.nz, and https://opendata.ccc.govt.nz/public-portal/. Please discuss your proposed project with the instructor before starting major work on it. Presentations will be 10 minutes each.
Team | Members |
---|---|
1 | Athira Nair Amit Shah Hakan Gulliksen Megha Malhotra |
2 | Jabir Singh Baith |
3 | Jing Wu Nicole Pearks Yuting Yang Zhexi Liang Weichen Jiang |
4 | Nimmy Lloyd Pooja Chindalur Savika Gunasinghe Sujith Sampath Kumar |
5 | Romalee Amolic Prashant Islur Piyush Rastogi Jonathan Munro |
6 | Vee Chalermglin Luke Chune Byunggu Kang Hiroyuki Nezu |
7 | Vaisakh Radhadkrishnan Ankit Jaiswal |
8 | Sam Davidson Aileen Medina Pradeep Raja Mengqi (Peter) Shi |
9 | Daniel Bentall Blake List Ben Faulks Sid Bhatnagar |
10 | |
11 | |
12 | |
Item | Points |
---|---|
Topic assignments | 20 |
Research assignment | 20 |
Project | 30 |
Exam | 25 |
Articles | 5 |
Total | 100 |
If you are unable to complete an assignment on time, please advise the instructor as soon as possible so that alternative arrangements can be made.
* These readings are available via Moodle