Tour of the Terminal: Using Unix or Mac OS X Command-Line


Dawn Koffman is a Statistical Programmer at the Office of Population Research at Princeton University. She earned an MS in Computer Science from the University of Wisconsin-Madison and an MPH in Epidemiology and Biostatistics from UMDNJ and Rutgers University.

5/05/2014 from 9:30 AM to 12:00 PM ~ 300 Wallace Hall
You'll be a better data scientist if you're comfortable working in a Unix (or Linux or Mac OS X) command-line environment and are able to make use of command-line tools. For example, most data needs to be cleaned, and often reshaped and combined with other data, before it can be easily viewed or used to obtain descriptive statistics or estimate multivariate models. Command-line tools provide flexible and efficient ways to handle these cleaning and data management tasks, regardless of how large the data is. The specific topics included in the Tour of the Terminal workshop are: using an interactive shell; file system structure, pathnames and permissions; pipelines, sequential execution, background execution and I/O redirection; the emacs text editor; commands, options and arguments; building shell scripts; regular expressions; and the Unix stream editor (sed). So consider taking a break from the point-and-click interface and enhance your data science toolset.
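To give a flavor of the cleaning tasks mentioned above, here is a minimal sketch that combines a regular expression, sed, a pipeline, and I/O redirection. It is not taken from the workshop materials; the file names, the sample data, and the `workdir` variable are hypothetical and exist only for illustration.

```shell
# Hypothetical scratch directory so the example does not touch real files.
workdir=$(mktemp -d)

# Hypothetical sample data: a small CSV with stray spaces around the
# commas and one blank line.
printf 'id, name ,score\n1, alice ,90\n\n2, bob ,85\n' > "$workdir/sample.csv"

# sed applies two edits to every line: the first removes spaces on either
# side of each comma, the second deletes empty lines. The output is
# redirected into a new file.
sed -e 's/ *, */,/g' -e '/^$/d' "$workdir/sample.csv" > "$workdir/clean.csv"

# A pipeline: skip the header line, then count the remaining data rows.
tail -n +2 "$workdir/clean.csv" | wc -l
```

The cleaned file keeps the header and the two data rows, with the blank line and the extra spaces gone. Each step is a small, single-purpose command, which is what makes this style easy to extend to larger files.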
This workshop is intended for those with little or no experience working in a Unix/Linux or Mac OS X command-line environment.
Lecture, discussion and demonstration are used to convey concepts and illustrate examples.
It is recommended that attendees either bring a Mac laptop, or obtain an account on a Princeton University Linux computer such as Nobel and bring a Windows or Mac laptop that can be used to access the Linux computer.
Presentation (PDF)
Data (gz)