This lesson is being piloted (Beta version)

R EDA: Instructor Notes

Dataset

The dataset is extracted from the nycflights13 package. It is preferable that learners download the file in advance, as it is quite large.

We have seen problems where learners used posit.cloud and downloading and opening the file caused RStudio to crash. In that case, we recommend that the learner use the nycflights13 package directly instead of downloading the file.
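A minimal sketch of loading the data straight from the package. The notes do not say which nycflights13 table the lesson file was extracted from, so the use of `flights` here is an assumption, and `flights_df` is just an illustrative name.

```r
# Run once if the package is not installed yet:
# install.packages("nycflights13")
library(nycflights13)

# Assumption: the lesson file corresponds to the `flights` table.
flights_df <- nycflights13::flights

# Quick checks that the data is available without downloading any file.
dim(flights_df)
names(flights_df)
```

Because the data ships with the package, nothing has to be downloaded or opened through the file browser.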

RStudio and Multiple R Installs

Some learners may have previous R installations. On Mac, if a new install is performed, the learner’s system will create a symbolic link, pointing to the new install as ‘Current.’ Sometimes this process does not occur, and, even though a new R is installed and can be accessed via the R console, RStudio does not find it. The net result of this is that the learner’s RStudio will be running an older R install. This will cause package installations to fail. This can be fixed at the terminal. First, check for the appropriate R installation in the library:

ls -l /Library/Frameworks/R.framework/Versions/

We are currently using R >=3.2. If it isn’t there, the learner will need to install it. If it is present, set the symbolic link Current to point to the R >=3.2 directory:

ln -sfn /Library/Frameworks/R.framework/Versions/3.x.y /Library/Frameworks/R.framework/Versions/Current

Then restart RStudio.
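To confirm that RStudio picked up the intended installation, the learner can run something like the following in the RStudio console. This is only a quick check; it uses base R and changes nothing.

```r
# Version string of the R that this session is actually running.
R.version.string

# Installation directory of that R; on a Mac this should sit under
# /Library/Frameworks/R.framework/ for a standard install.
R.home()
```

If the version shown is still the old one, the symbolic link is probably still pointing at the previous install.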

Narrative

Before we start

Intro to R

Fast forward to today, there are only a few mechanical reasons why <- is preferred over =. The <- operator ranks higher in operator precedence than =. If you wish to assign a variable inside a function call, <- is the only option, because = in that position names an argument instead of creating a variable.
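If learners ask for a demonstration, a short sketch like this one shows both points. All variable names are illustrative only.

```r
# `<-` binds more tightly than `=`, so this parses as a = (b <- 3).
a = b <- 3

# The reverse order parses as (c1 <- c2) = 3 and fails when evaluated.
reversed <- tryCatch(
  eval(parse(text = "c1 <- c2 = 3")),
  error = function(e) "error"
)
reversed

# Inside a function call, `<-` creates `vals` in the calling environment.
mean(vals <- c(1, 2, 3))
vals

# Here `=` only names the argument `x` of mean(); no variable is created.
mean(x = c(4, 5, 6))
```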

Starting with data

The two main goals for this lesson are:

Manipulating data with dplyr

Error: Can’t rename columns that don’t exist.
x Column NA doesn’t exist.

Make sure you have read in the CSV file with the option that interprets the "NULL" string as NA, like so:

interviews <- read_csv("data/SAFI_clean.csv", na = "NULL")
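To show learners what the `na` argument changes, a small sketch with inline data can help. The two-row sample below is made up for illustration and is not taken from the SAFI file; it assumes readr 2.0 or later, where `I()` marks a string as literal data.

```r
library(readr)

# Hypothetical sample data, not from SAFI_clean.csv.
csv_text <- "village,items_owned\nGod,NULL\nChirodzo,radio;bicycle\n"

# Default: only "" and "NA" count as missing, so "NULL" stays a string.
default_na <- read_csv(I(csv_text), show_col_types = FALSE)

# With na = "NULL", the same cell is read as a missing value.
null_as_na <- read_csv(I(csv_text), na = "NULL", show_col_types = FALSE)

default_na$items_owned[1]
is.na(null_as_na$items_owned[1])
```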

Visualizing data with ggplot2

Technical Tips and Tricks

Show how to use the ‘zoom’ button to blow up graphs without constantly resizing windows.

Sometimes a package will not install. You can try a different CRAN mirror:
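One way to do this from the console, as a sketch: set the `repos` option for the current session and retry. The cloud mirror URL is one commonly used choice, and "dplyr" is only an example package name.

```r
# Point this session at the CRAN cloud mirror.
options(repos = c(CRAN = "https://cloud.r-project.org"))
getOption("repos")

# Then retry the failing install, for example:
# install.packages("dplyr")

# In an interactive session, chooseCRANmirror() offers a list of mirrors.
```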

Alternatively you can go to CRAN and download the package and install from ZIP file:
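A sketch of the console equivalent, assuming the file has already been downloaded. The path is a placeholder and must be replaced with the real location of the downloaded file.

```r
# Hypothetical path: replace with the file downloaded from CRAN.
pkg_file <- "path/to/package.zip"

if (file.exists(pkg_file)) {
  # repos = NULL tells install.packages() to install from a local file.
  install.packages(pkg_file, repos = NULL)
} else {
  message("File not found: ", pkg_file)
}
```

Installing from a local file does not fetch the package's dependencies, so those may need to be installed separately.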

It is important that R and the R packages be installed locally, not on a network drive. If a learner is using a shared machine where their account is not stored locally, this can create a variety of issues (this often happens on university computers). Ideally the learner will notice these issues before the workshop, but depending on the machine and how its IT support has set things up, it may be very difficult or even impossible to make R work without their help.
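A quick way to see where R is installing packages is to list the library paths and check whether they are writable. This sketch cannot tell on its own whether a path is on a network drive; the instructor has to judge that from the paths shown.

```r
# Directories where this R session looks for and installs packages.
lib_dirs <- .libPaths()

# file.access() returns 0 when the requested access (mode 2 = write) is allowed.
writable <- file.access(lib_dirs, mode = 2) == 0

data.frame(library = lib_dirs, writable = unname(writable))
```

A path that points to a mapped or mounted network location, or a first entry that is not writable, is a hint that installs may fail.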

If learners are having issues with one package, they may have issues with another. It’s often easier to make sure they have all the needed packages installed at one time, rather than deal with these issues over and over.
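A minimal sketch of checking for everything in one pass. The package list is an assumption based on the packages mentioned in these notes, and `install_missing()` is a hypothetical helper, not part of any package.

```r
# Hypothetical helper: report (and optionally install) packages that are absent.
install_missing <- function(pkgs, dry_run = FALSE) {
  missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))
  if (length(missing_pkgs) > 0 && !dry_run) {
    install.packages(missing_pkgs)
  }
  invisible(missing_pkgs)
}

# Assumed package list; adjust to what the workshop actually uses.
needed <- c("dplyr", "ggplot2", "readr", "nycflights13")

# dry_run = TRUE only reports; set it to FALSE to install what is missing.
missing_now <- install_missing(needed, dry_run = TRUE)
missing_now
```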

| character on Spanish keyboards: The Spanish Mac keyboard does not have a | key. This character can be created using:

`alt` + `1`

Other Resources

If you encounter a problem during a workshop, feel free to contact the maintainers by email or open an issue.

For a more in-depth coverage of topics of the workshops, you may want to read “R for Data Science” by Hadley Wickham and Garrett Grolemund.