1. What is it you say you do here?
I work in the Data Science department, but if you ask ten different data scientists what a data scientist actually does, you’ll get ten different answers.
The best way to describe what I do is that I simply use data to solve problems. These problems might come from clients needing answers to specific questions, from the Øptimus partners saying we need to find a better way to do something, or just roadblocks that I’ve repeatedly encountered that I want to work around.
The era of “Mad Men” – you know, where we rely on intuition and our “gut” – is over. We have the tools to make data-driven decisions. We have the knowledge to make accurate and precise predictions without overstating our conclusions or being falsely confident in them.
2. So, you studied statistics. I didn’t like that class very much in college. How do you actually use it in your job?
I like statistics because I’m mathematically inclined and I wanted to focus on real-world applications. In broad strokes, the brand of statistics I use most frequently takes information from samples and uses it to make inferences about a larger population.
For example, when we conduct a survey of voters in the U.S. and see that 27% of people in our sample support candidate X, we understand that that doesn’t mean exactly 27% of all voters will support candidate X – but we can quantify that uncertainty and get a pretty good idea of what the larger population’s voting behavior will be.
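To make "quantify that uncertainty" concrete, here is a minimal sketch of one common approach: a normal-approximation 95% confidence interval for a sample proportion. The 27% comes from the example above; the sample size of 1,000 and the function name `proportion_interval` are assumptions for illustration only, and a real survey would also need to account for weighting and design effects.

```python
import math

def proportion_interval(p_hat, n, z=1.96):
    """Normal-approximation confidence interval for a sample proportion.

    p_hat: observed share in the sample (e.g. 0.27)
    n:     number of respondents (hypothetical here)
    z:     critical value; 1.96 corresponds to roughly 95% confidence
    """
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin_of_error = z * standard_error
    return p_hat - margin_of_error, p_hat + margin_of_error

# Assumed sample size of 1,000 respondents (not from the interview).
low, high = proportion_interval(0.27, 1000)
print(f"Estimated support: {low:.1%} to {high:.1%}")
```

Under those assumptions the margin of error is roughly 2.8 percentage points on either side of 27%. Quadrupling the sample size would only cut that margin in half, which is why bigger samples show diminishing returns.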
We also use these samples to build models. Models are fun because we can estimate the extent to which certain characteristics (e.g., age, gender, ethnicity, income) affect outcomes we’re interested in (e.g., favorability toward a certain candidate, likelihood of voting in an election). Statistics allows us to gain far deeper insights into voting behavior than most people can imagine.
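As a rough illustration of what such a model can look like, the sketch below fits a one-variable logistic regression with plain gradient ascent. Everything in it is an assumption made for the example: the data are simulated rather than real survey responses, age is the only predictor, and the "true" effect of 0.8 is chosen arbitrarily. In practice you would likely use a statistics library and many more characteristics.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=2000):
    """Fit P(y=1) = sigmoid(b0 + b1*x) by gradient ascent on the log-likelihood."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # observed minus predicted
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

random.seed(0)
# Simulated voters (hypothetical): age measured in decades from the average age.
ages = [random.uniform(-3, 3) for _ in range(500)]
# Simulated outcome: 1 = supports the candidate, with a true age effect of 0.8.
votes = [1 if random.random() < sigmoid(-0.5 + 0.8 * a) else 0 for a in ages]

b0, b1 = fit_logistic(ages, votes)
print(f"intercept: {b0:.2f}, age effect (log-odds per decade): {b1:.2f}")
```

A positive fitted age effect means that, in this simulated sample, older voters are more likely to support the candidate. The estimate will land near the simulated value of 0.8 but not exactly on it, which is the same sampling uncertainty described in the survey example.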
3. If you could do anything else in the world, what would you do?
Hands down, I’d be a professor. I had the opportunity to teach in grad school at Ohio State and to be a TA both at my undergrad (Franklin College) and at OSU. I find statistics so powerful and interesting, and I love conveying that to other people in a meaningful way.
All too often, professors focus on antiquated methods and esoteric research and don’t make the material relevant to students. Like, I get it – math isn’t fun for most people and statistics seems like math’s weird cousin. But real-world examples like waiting times for Uber, predicting sales of beers at a bar, or estimating the proportion of voters who will vote for a candidate were relevant to my students and therefore made statistics just a bit more meaningful.
Once people realize that it’s not just random letters and squiggles on a board, but that statistics can actually describe the world around us and predict what may happen, it becomes much more palatable and – dare I say it? – cool.
4. Best tip for aspiring data scientists?
I have two:
1. Read. Read everything. Read articles. Read blogs. Read textbooks, tweets, anything that might expose you to how data science is used in the real world, how new methods are being developed, how you can solve a problem in a better way.
2. Recognize that data science is a combination of statistics, computer science, and contextual knowledge. All three are very important. You’ll naturally gravitate toward one piece, but you need to be proficient in all three to be a successful data scientist. Knowing the background of a problem is useless without understanding how data science fits into the picture. Being a boss at coding in R and Python doesn’t do you a bit of good if you don’t understand what statistical methods do and don’t apply. Understanding theoretical statistics is entirely irrelevant if you can only do things by pencil and paper and can’t scale things efficiently.