At Øptimus, we bring state-of-the-art predictive analytics to model individual behavior and deliver data-driven commercial marketing and research to our clients. I work on the data science team, refining the predictive engine behind several of our products and projects.
Among other things, I am part of the core group of data scientists for Øptimus Election Modeling, where I design the modeling framework and implement the machine learning algorithms behind it. One of our strengths has been advising political campaigns to make better decisions using a data-driven approach. For the first time, we are building a public-facing commercial product: a forecasting platform that estimates the probability of a GOP victory in individual House and Senate elections.
One of our core products is our modeling engine, which produces individual-level predictions about a given issue. These predictions are used to generate universes for targeted advertisements. I focus on improving the predictive performance of these models so that they accurately predict individual behaviors. To improve our techniques, I find cases that were predicted incorrectly, fine-tune features, explore patterns in our data to engineer new features, design and conduct experiments to tune modeling pipelines, and define metrics to evaluate persuasion models.
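The first step in that loop, finding cases the model got wrong, can be sketched with scikit-learn. This is a minimal illustration on synthetic data; the dataset, features, and model here are placeholders, not Øptimus's actual pipeline:

```python
# Error-analysis sketch: train a classifier, then pull out the rows it
# misclassified to ask "what do these cases have in common?" All data
# here is synthetic for illustration.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
pred = model.predict(X_test)

# The misclassified rows are the starting point for feature work:
# cluster them, inspect their feature distributions, and look for a
# signal the current features fail to capture.
errors = X_test[pred != y_test]
print(f"misclassified {len(errors)} of {len(X_test)} test rows")
```

In practice, inspecting the feature distributions of `errors` against the correctly classified rows is what suggests which features to fine-tune or add next.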
I love my job, as it allows me to wear the several hats that make up a data scientist's profile: data munging, validating data, exploring and analyzing data, running models, and testing business acumen. In the past year, I have worked on four different projects, each of which has allowed me to hone these skills further.
For example, for the Øptimus Election Modeling project, we have spent countless hours validating data, iterating over modeling pipelines, and finding ways to communicate our results to a politically inclined audience that may not care about the data-scraping or statistical details of the work. This project has been fast-paced and very exciting to work on.
Another project I enjoyed took on more of a business role in a commercial setting, which can be atypical for data scientists. Using a survey dataset from our client, I was able to define a viable business solution that allowed the client to leverage our modeling techniques and boost their business.
I hold a Master’s degree in Applied Mathematics from Iowa State University (ISU) and a BS-MS dual degree in Physics from the Indian Institute of Science Education and Research (IISER), Pune.
Prior to joining Øptimus, I was involved in several academic research projects. As an undergraduate student, I worked on projects in physics and chemistry with a primary focus on mathematical modeling. I was curious about economics, which led me to use game theory to model congestion and extreme events in traffic networks. Additionally, I simulated the prisoner's dilemma on a social network and studied patterns in the evolution of players' behavior. Ultimately, these projects led me to join the Applied Mathematics PhD program at Iowa State University. During this time, I developed a taste for statistics. I spent the summers of 2016 and 2017 at workshops aimed at exposing graduate students to challenging and exciting real-world problems arising in industrial and government laboratory research.
Additionally, I worked with the Environmental Protection Agency to build a model that predicts pollutant levels over the Pacific Ocean by fusing surface and satellite-derived PM observations. This project turned out to be a turning point in my life. I realized I was thrilled to be working in an industrial environment on real-world problems where I could apply my mathematical acumen while also making a relatively immediate impact, something I had found lacking in my academic research. I decided then to graduate with a Master's degree in Applied Mathematics instead and take up a Data Science Fellow role at Øptimus.
I think one of the most common misconceptions is the belief that being able to run a random forest or XGBoost model makes one a data scientist. In my experience, a large portion of the time is spent defining a specific problem, developing and testing hypotheses, gathering, cleaning, and exploring data, scaling the process, and building tools to communicate the results to the target audience. Developing and applying a machine learning algorithm is an important skill, but still only a small part of the process.
In the real world, no one will hand you a clean dataset. Get your hands dirty with data – gather, clean, and explore the data until you start dreaming about it! Strong fundamentals in computer science, statistics, and mathematics are a must. They will make it easier to understand the nuances and inner workings of machine learning algorithms.
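To make the "gather, clean, explore" advice concrete, here is a small pandas sketch. The tiny CSV and its columns are invented for illustration; real datasets are far messier:

```python
# A small taste of "getting your hands dirty": load a messy dataset,
# clean it, and summarize it. The CSV contents are invented.
import io

import pandas as pd

raw = io.StringIO(
    "age,income,state\n"
    "34,52000,IA\n"
    ",61000,ia\n"      # missing age, inconsistent casing
    "41,,IA\n"         # missing income
    "29,48000,MN\n"
)
df = pd.read_csv(raw)

# Typical cleaning steps: normalize categories, handle missing values.
df["state"] = df["state"].str.upper()
df["age"] = df["age"].fillna(df["age"].median())
df = df.dropna(subset=["income"])

# Explore: group-level summaries often reveal the first patterns.
print(df.groupby("state")["income"].mean())
```

Steps like these, repeated over and over on real data, are where most of the work (and most of the learning) happens.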
Domain knowledge is also very important. You don't need to be an expert in the field; however, understanding the context is essential. It will help you better understand the model, notice potential biases in it, and engineer features that improve model performance.
In the end, remember that your potential clients may not understand the technical aspects of your work; they care whether the model is logical and trustworthy. It is important to be able to communicate with a non-technical audience.
To be honest, I find myself overwhelmed yet excited by the continuous stream of new technology. I am a regular reader of Data Science Weekly; I love their newsletter, articles, and training resources. My colleagues are also a great resource for learning about new technologies.
I also keep up with research papers and blogs published by leading companies such as Airbnb, Google, and Facebook to see how they put new technology into practice.