NAOC Workshop: Before You Arrive

last updated: 12 August 2016 15:30

To help us dive right into the workshop’s presentations and exercises, we ask that you arrive with your laptop ready: all of the necessary software and the sample data for running the example scripts should already be loaded onto your computer.  Other optional software packages and workshop materials will also be available for download before the workshop begins.  This page describes these downloads and where to find them.

Computer Software to Install

To get the most out of the workshop, we suggest that you install the software that we will be using before you arrive.  Some of this software is required to run the examples yourself (flagged as “required”), while other downloads are optional.  Here’s the list:

  • R (required) – Much of the script code that we will be using during the workshop, and all of the code used during demonstrations of applications, will be for R. So, if you do not have R installed, please download and install a version for your computer.  You will want to install the “base” software, for example from this website: http://lib.stat.cmu.edu/R/CRAN/ or other sites listed here: https://cran.r-project.org/mirrors.html .  If you have an older version of R, run the following command within R to update your packages and ensure compatibility: type “update.packages()” and follow the instructions, confirming with a “y” that you want to update any R components (“packages” or “libraries”) for which newer versions are available.
  • R Add-On (“Contributed”) packages (required) – When you install R, you are installing “just” the basic software, but people all over the world are creating add-on packages that are useful, if not essential, for some purposes, including several parts of the workshop. Once the base R program has been installed, you can install these add-on packages (also called “libraries”) from within R itself.  To install all of the additional packages that we will be using, start an R session and type in the following command as a single line (unfortunately copy and paste from this webpage does not appear to work) into the window after the “>” prompt and press your <ENTER> key:
    install.packages(c("raster", "maptools", "fields", "maps", "randomForest", "PresenceAbsence", "mgcv", "plyr", "dplyr", "rgdal", "unmarked", "MuMIn"))
    You may be prompted to select a location (a “CRAN Archive”) from which to download the packages; select a location relatively near to you, but the specific location doesn’t matter.  If you are already using R and have some of these packages installed, don’t worry: no harm can come of telling R to install a package that is already installed.
  • Cygwin (required for Windows computers only!) – In order to take the massive bundles in which eBird data comes and whittle them down to manageably small units, and to do so quickly and on computers with minimal amounts of available memory, we will be using a few command-line tools that come from the distant ancient days of UNIX, before GUIs were invented (but well after the dinosaurs went extinct…except for birds, that is). Macs and Linux computers already have these tools, but people using Windows OS computers will need to install some additional software from here: https://cygwin.com/install.html .  You almost certainly want to install the 64-bit version of Cygwin unless your laptop is particularly ancient.  You will only need to install the “minimal base packages” described in the installation notes.  There are probably other ways for Windows OS users to get the same set of UNIX tools, but we know from experience that Cygwin’s version just works (it’s one of the first bits of software that I’ll install on a new computer, and has been for well over a decade now).
  • RStudio (optional) – RStudio is a helper program that works with R, providing a nice environment in which to edit R scripts, run the scripts, and look at output. Using RStudio is not essential, but we will be running all demonstrations from within RStudio.  Basically, we like it, and we think that you’d like using it too, over working with the “raw” version of R.  You can download the free “RStudio Desktop” from here: https://www.rstudio.com/products/rstudio/download2/.  I believe that you will need to have R installed on your computer before installing RStudio, or RStudio will not install correctly.
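Once the packages listed above have been installed, a quick check like the following can confirm that nothing was missed.  This is only a minimal sketch, not part of the official workshop scripts: the helper name find_missing is made up for illustration, and the only built-in R functions it relies on are installed.packages(), setdiff() and install.packages().

```r
# The add-on packages named in the install command above.
workshop_pkgs <- c("raster", "maptools", "fields", "maps", "randomForest",
                   "PresenceAbsence", "mgcv", "plyr", "dplyr", "rgdal",
                   "unmarked", "MuMIn")

# Return the entries of `wanted` that are not in `installed`.
# NOTE: find_missing is a hypothetical helper written for this example;
# it is not a function that ships with R.
find_missing <- function(wanted, installed = rownames(installed.packages())) {
  setdiff(wanted, installed)
}

missing_pkgs <- find_missing(workshop_pkgs)

if (length(missing_pkgs) == 0) {
  message("All workshop packages are installed.")
} else {
  message("Not yet installed: ", paste(missing_pkgs, collapse = ", "))
  # Only try to download when someone is at the keyboard to pick a mirror.
  if (interactive()) install.packages(missing_pkgs)
}
```

If the message lists any packages, re-running the install command for just those names is usually enough.  Because the straight quotes in this snippet are plain ASCII, it should survive copy and paste into the R console, but typing it by hand works just as well.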

Workshop Material Downloads

In addition to the software, you will need data to be processed by the software.  You should have the data that we will use in our examples already on your computer when you arrive.  For those that are interested, we also provide instructions on how to access and download the full eBird data products.

  • eBird sample data (required) – We have set up a folder of workshop materials that you can download from this link.  All of the data are located in the “data” subfolder; click on the name to be taken there.  You will see a “Download” button at the top of that page, for your use.
  • Workshop scripts and presentations (required) – All of the other materials — the latest versions of PowerPoint slides and files containing computer scripts — are available at the same link as the eBird sample data.  These materials are in separate sub-folders for the various topics of the workshop.
  • eBird data products (optional) – For convenience, we will provide subsets of eBird data for all of the exercises used in the workshop (stay tuned for instructions on downloading these in a follow-up message).  However, you may be interested in downloading the full data sets before the workshop to get a feel for the data. This page provides instructions on how to download the publicly available eBird data: http://help.ebird.org/customer/portal/articles/1010524-can-i-download-raw-data-from-ebird-  and here is the web page on which you would fill out your request for access: http://ebird.org/ebird/data/download.  As part of the data request process eBird will ask you to fill out a short form describing your intended use cases; we use this information both to understand how people want to use the data, and for obtaining funding to maintain eBird.  Make sure to mention that you are part of the NAOC workshop. Forms will be reviewed daily.   As a word of caution: some of these data sets are large. E.g., you may need up to 250GB free on your hard drive to expand and unzip the “EBD World” dataset.