Key Points
Before we Start
Getting to know the data
Exploring with summary statistics
Joining data
Boxplots and linear regressions
What is the next step?
Glossary
Cheat sheet of functions used in the lessons
Lesson 1 – Introduction to R
- `sqrt()` # calculate the square root
- `round()` # round a number
- `args()` # find what arguments a function takes
- `length()` # how many elements are in a particular vector
- `class()` # the class (the type of element) of an object
- `str()` # an overview of the object and the elements it contains
- `typeof()` # determines the (R internal) type or storage mode of any object
- `c()` # create a vector; add elements to a vector
- `[ ]` # extract and subset a vector
- `%in%` # test if a value is found in a vector
- `is.na()` # test which values are missing
- `na.omit()` # returns the object with incomplete cases removed
- `complete.cases()` # returns a logical vector indicating which cases are complete
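
As a quick illustration of how several of the Lesson 1 functions fit together, here is a minimal sketch in base R. The vector `rooms` and its values are made up for this example and do not come from the lesson data.

```r
# Hypothetical example vector (not from the lesson data)
rooms <- c(2, NA, 4, 9, 16)

sqrt(16)            # 4
round(3.14159, 2)   # 3.14
length(rooms)       # 5
class(rooms)        # "numeric"
typeof(rooms)       # "double"

# Subset with [ ]: keep non-missing values greater than 3
rooms[rooms > 3 & !is.na(rooms)]   # 4 9 16

# Test membership and missingness
4 %in% rooms        # TRUE
is.na(rooms)        # FALSE TRUE FALSE FALSE FALSE

# complete.cases() returns a logical vector that can be used for subsetting
rooms_complete <- rooms[complete.cases(rooms)]   # 2 4 9 16
```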
Lesson 2 – Starting with Data
- `download.file()` # download files from the internet to your computer
- `read_csv()` # load a CSV file into R memory
- `head()` # shows the first 6 rows (by default)
- `view()` # invoke a spreadsheet-style data viewer
- `read_delim()` # load a delimited file in table format into R memory
- `str()` # check the structure of the object and information about the class, length and content of each column
- `dim()` # check the dimensions of a data frame
- `nrow()` # returns the number of rows
- `ncol()` # returns the number of columns
- `tail()` # shows the last 6 rows (by default)
- `names()` # returns the column names (synonym of `colnames()` for data frame objects)
- `rownames()` # returns the row names
- `summary()` # summary statistics for each column
- `glimpse()` # like `str()` applied to a data frame, but tries to show as much data as possible
- `factor()` # create factors
- `levels()` # check the levels of a factor
- `nlevels()` # check the number of levels of a factor
- `as.character()` # convert an object to a character vector
- `as.numeric()` # convert an object to a numeric vector
- `as.numeric(as.character(x))` # convert a factor whose levels are numbers stored as text to a numeric vector
- `as.numeric(levels(x))[x]` # convert a factor whose levels are numbers to a numeric vector, using the level labels rather than the internal codes
- `plot()` # plot an object
- `addNA()` # convert NA into a factor level
- `data.frame()` # create a data.frame object
- `ymd()` # convert a vector representing year, month, and day to a Date vector
- `paste()` # concatenate vectors after converting to character
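
The two factor-to-number conversions listed above are easy to mix up, so a short base R sketch may help. The data frame `survey` and its columns are hypothetical and only serve to show what each call returns.

```r
# Hypothetical data frame; names and values are illustrative only
survey <- data.frame(
  village = c("God", "Chirodzo", "God", "Ruaca"),
  year    = factor(c("2016", "2017", "2016", "2018"))
)

dim(survey)      # 4 2
nrow(survey)     # 4
names(survey)    # "village" "year"

village <- factor(survey$village)
levels(village)    # "Chirodzo" "God" "Ruaca"
nlevels(village)   # 3

# as.numeric() on a factor returns the internal level codes, not the labels
as.numeric(survey$year)   # 1 2 1 3

# Both of these recover the numbers shown in the labels
year_a <- as.numeric(as.character(survey$year))          # 2016 2017 2016 2018
year_b <- as.numeric(levels(survey$year))[survey$year]   # 2016 2017 2016 2018
```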
Lesson 3 – Data Wrangling with dplyr and tidyr
- `str()` # check the structure of the object and information about the class, length and content of each column
- `view()` # invoke a spreadsheet-style data viewer
- `select()` # select columns of a data frame
- `filter()` # select a subset of rows in a data frame
- `%>%` # pipe: pass the output of one function to the next, for example to select and filter in one chain
- `mutate()` # create new columns based on the values in existing columns
- `head()` # shows the first 6 rows (by default)
- `group_by()` # group the data so that later steps are applied to each group and the results combined
- `summarize()` # collapses each group into a single-row summary of that group
- `mean()` # calculate the mean value of a vector
- `!is.na()` # test which values are not missing
- `print()` # print values to the console
- `min()` # return the minimum value of a vector
- `arrange()` # arrange rows by variables
- `desc()` # transform a vector into a format that will be sorted in descending order
- `count()` # counts the total number of records for each category
- `pivot_wider()` # reshape a data frame by spreading a key-value pair across multiple columns
- `pivot_longer()` # reshape a data frame by collapsing multiple columns into a key-value pair
- `replace_na()` # replace NAs with specified values
- `n_distinct()` # get a count of unique values
- `write_csv()` # save to a CSV-formatted file
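
The Lesson 3 verbs are usually chained with the pipe. The sketch below assumes the dplyr and tidyr packages are installed; the data frame `households` and its values are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Hypothetical data; not taken from the lesson dataset
households <- data.frame(
  village = c("God", "God", "Chirodzo", "Chirodzo", "Ruaca"),
  members = c(3, 7, NA, 4, 12),
  rooms   = c(1, 1, 2, 3, 3)
)

# Drop rows with missing members, then summarize per village,
# sorted from the largest to the smallest mean household size
village_summary <- households %>%
  filter(!is.na(members)) %>%
  group_by(village) %>%
  summarize(mean_members = mean(members), n_households = n()) %>%
  arrange(desc(mean_members))

# Count records per village (rows with NA in members are still counted)
village_counts <- households %>% count(village)

# Replace missing values in one column with 0
households_filled <- households %>%
  mutate(members = replace_na(members, 0))
```

Here `village_summary` has one row per village, which is the effect of combining `group_by()` with `summarize()`.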
Lesson 4 – Data Visualization with ggplot2
- `read_csv()` # load a CSV-formatted file into R memory
- `ggplot(data = , aes(x = , y = )) + geom_point() + facet_wrap() + theme_bw() + theme()` # skeleton for creating plot layers
- `aes()` # map variables to the plot axes and to visual properties such as size, shape, and color
- `geom_` # graphical representation of the data in the plot (points, lines, bars); add a geom to the plot with the `+` operator
- `facet_wrap()` # split one plot into multiple plots based on a factor included in the dataset
- `labs()` # set the plot labels
- `theme_bw()` # use a black-and-white theme with a white background
- `theme()` # modify one or more theme elements in a specific ggplot object
- `+` # arrange ggplots horizontally (with patchwork)
- `/` # arrange ggplots vertically (with patchwork)
- `plot_layout()` # set the width and height of individual plots in a patchwork of plots
- `ggsave()` # save a ggplot
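
To show the plot skeleton filled in, here is a minimal sketch that assumes the ggplot2 package is installed. The data frame, its column names, and the axis labels are hypothetical, and the plot is saved to a temporary PDF file rather than a project folder.

```r
library(ggplot2)

# Hypothetical data; not taken from the lesson dataset
households <- data.frame(
  village = c("God", "God", "Chirodzo", "Chirodzo", "Ruaca", "Ruaca"),
  members = c(3, 7, 5, 4, 12, 6),
  items   = c(2, 5, 4, 3, 9, 6)
)

# Build the plot layer by layer with the + operator
p <- ggplot(data = households, aes(x = members, y = items)) +
  geom_point() +
  facet_wrap(~ village) +
  labs(x = "Household members", y = "Items owned") +
  theme_bw() +
  theme(text = element_text(size = 12))

# Save the plot; the output path here is a temporary file
out_file <- tempfile(fileext = ".pdf")
ggsave(out_file, plot = p, width = 6, height = 3)
```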
Lesson 5 – Processing JSON data
- `read_json()` # load a JSON file into an R object
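
A small round trip can show what `read_json()` returns. This sketch assumes `read_json()` is the one from the jsonlite package; `write_json()` (also from jsonlite) and the sample records are used only to create a file to read and are not part of the cheat sheet.

```r
library(jsonlite)

# Hypothetical records written to a temporary JSON file
records <- data.frame(id = c(1, 2), village = c("God", "Ruaca"))
json_file <- tempfile(fileext = ".json")
write_json(records, json_file)

# simplifyVector = TRUE asks jsonlite to return a data frame
# instead of a nested list where possible
loaded <- read_json(json_file, simplifyVector = TRUE)
```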