Stat 240 - Intro to Data Science - Spring 2017

 

Dr. David Campbell


Week 1, Day 1 notes:

Data acquisition and cleaning


COURSE CONTENT:


R:  Introduction to RStudio

Data types


SQL:  Using RSQLite

Performing operations on the database

Temporary databases


Twitter:  Accessing Twitter data from the API

Sentiment analysis

Dealing with time

Dealing with text

Regular expressions


Web scraping:  Dealing with HTML code

Writing a web crawler to parse HTML

Extracting tables from HTML

JSON-formatted data

Using an API to scrape JSON data


Statistical visualizations:

Density plots

Histograms

Bar charts

Boxplots

Word clouds

Conditional plots

Scatter plots


Statistical tools:

Density estimation

Moving average

Data summaries




© Dave Campbell 2007-2017