Getting started
It can be easy to feel overwhelmed when beginning with research. There’s so much out there to learn about, so many possible directions. Sometimes it’s even difficult to figure out which questions to ask.
The good news is that the best place to start is by learning the skills you need to do the work. The more comfortable you are with these skills, the easier it will become to narrow down your focus, to ask the right questions, and — importantly — to do the work to answer those questions.
Here are some basic skills you should become proficient in before starting on any independent research in this group.
Learn how to read an academic paper
Reading a paper is not an easy task. Even for people who have been working in research for years, it can be challenging. Research is very specialized, which means that reading something even a little outside your area of expertise can require looking up a lot of words (or working out an unfamiliar meaning for some familiar ones). So don’t be intimidated if this process doesn’t come to you naturally!
The University of Illinois Chicago (UIC) has put together a very helpful infographic on how to approach reading a research paper.
Phase one: skim the paper
Don’t try to understand every single aspect of the paper the first time through. Read over it, maybe jotting down a few questions as they come up; the goal of this pass is just to “survey” the article, as UIC puts it. You want to get a sense of the overall context, methods, and results without understanding the details. If you’re new to reading papers, you may actually want to do this step a few times, seeing what new questions come up with each quick read-through.
Phase two: dig deeper
You may want to pick just a few areas to explore in more detail, as these papers can be dense, and you may only be interested in the method or the system, but not both.
Always: keep track of what you read
You may return to this paper later, or it may not end up being really useful. Either way, you should keep track of what you've read. Organize your reading in whatever system makes the most sense to you, but be sure to include the title, a link to the paper (the DOI), and whatever notes will help you remember what you got out of it.
Get set up on JupyterHub
This process is fairly straightforward. Casey will provide your Smith login info to CATS, and they will add you to the list of users on JupyterHub. Once that’s done, you should be able to access your account by typing "jupyterhub.smith.edu" into your browser window.
Log in with your Smith credentials and you'll be ready to get started. If you are accessing JupyterHub from off campus, there is an additional step: install the Pulse Secure VPN (instructions here) and log in to the VPN with your Smith credentials before accessing JupyterHub. Once you've signed in (and approved the Duo push), you should be able to access the server as normal.
Here's a useful tutorial on JupyterHub you can use to get familiar with how it works.
Learn some Python
Free online courses
There are lots of resources online that can help get you started with Python.
For all of these, it is highly recommended that you use Jupyter Notebooks to take notes, so you have sample code you can easily reference when working on projects. You will retain more when you copy what is done on the screen instead of just watching it, and then you can go back and make changes and experiment with the code.
- Coursera: Python for Everybody
- Coursera: Introduction to Data Science in Python
- Problem Solving with Python online text
Data Science Tools
Once you have some familiarity with Python (while loops, for loops, arrays, basic math, and simple plotting), you should learn some tools for working with data. We have a lot of data to process in this kind of research, so the bulk of our time is spent using these tools.
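As a rough self-check on the basics listed above, here is a minimal sketch that uses a for loop, a while loop, a list (the simplest Python "array"), and some basic math. The numbers are made up for illustration and are not from any group project. If every line makes sense to you, you are probably ready to move on to the data tools.

```python
import math

# Hypothetical measurements; a plain Python list is the simplest "array".
values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# for loop: add up the values, then divide to get the mean.
total = 0.0
for v in values:
    total += v
mean = total / len(values)

# Basic math: population standard deviation of the same list.
variance = sum((v - mean) ** 2 for v in values) / len(values)
std = math.sqrt(variance)

# while loop: count how many times 100 must be halved to drop below 1.
x = 100.0
halvings = 0
while x >= 1.0:
    x /= 2
    halvings += 1

print("mean:", mean)
print("standard deviation:", std)
print("halvings:", halvings)
```

Try changing the list or the starting value of x in a notebook cell and predicting the result before you run it.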
I’ve created some Jupyter notebooks to help you get familiar with Pandas. You can find them in this Google Drive folder. The subfolders are numbered in the order you should go through them. For each one, download the notebook and the data file(s), upload them into JupyterHub, and start exploring. Reach out with any questions you have as you go through this!
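To give a feel for what working with Pandas looks like before you open the notebooks, here is a small sketch of a few common operations: loading a table, filtering rows, adding a computed column, and grouping. The data, column names, and values are hypothetical placeholders, not the group's actual data files, and the notebooks may cover these steps differently.

```python
import io

import pandas as pd

# Hypothetical data standing in for a data file uploaded to JupyterHub.
csv_text = """sample,temperature_K,pressure_bar
A,298.0,1.0
B,310.5,1.2
C,298.0,0.9
D,325.0,1.5
"""

# Read the CSV into a DataFrame. For a real file, pass its path
# to pd.read_csv instead of the StringIO object.
df = pd.read_csv(io.StringIO(csv_text))

# Look at the first few rows and at summary statistics.
print(df.head())
print(df.describe())

# Keep only the rows that satisfy a condition.
warm = df[df["temperature_K"] > 300]

# Add a new column computed from an existing one.
df["temperature_C"] = df["temperature_K"] - 273.15

# Group rows that share a temperature and average their pressure.
mean_pressure = df.groupby("temperature_K")["pressure_bar"].mean()
print(mean_pressure)
```

Running each step in its own notebook cell and inspecting the result is a good way to see what each operation does.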