Fast-growing, data-driven companies deal with many different kinds of data, so there are more flavors of data scientist than you might think. Some data scientists work on improving user experience and rely not only on data but also on the behavioral traits of users, as uberdata suggests. Others work on image recognition and may employ advanced machine learning techniques to distinguish a cat from a human purely on the basis of algorithms.
So in a large company that deals with a variety of data, there is no standard job description for a data scientist. The roles vary, and so do the required skill sets. Here are some areas in a data-driven company where data scientists are involved, along with the skill set that each job requires:
The Growth DS
The Growth DSs partner with the Marketing team and the Product team to identify ways to drive more traffic to the app or portal. They usually focus on connections, network, long-term user engagement, types of user engagement, dormant member resurrection, new member registration, and referrals. They discover areas of opportunity, outline how to close the gap, guide product design, and analyze impact after product release.
There can be many different acquisition channels involving other platforms, so these data scientists help build pipelines that link the company's data with data from third parties. They also help design and analyze scalable experiments, using both offline tests (not tracked at a user level) and online tests (more traditional A/B experiments).
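As a rough illustration of what analyzing an online A/B experiment can look like, here is a minimal sketch of a two-proportion z-test on conversion counts. The function name and the sample numbers are made up for illustration, and a real growth team would likely use a dedicated statistics library and a more careful experimental design.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and users in the control group (hypothetical names).
    conv_b / n_b: conversions and users in the treatment group.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Illustrative numbers only: 10% vs. 15% sign-up rate on 1,000 users each.
z, p_value = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

A small p-value suggests the difference between the two variants is unlikely to be due to chance alone, but it says nothing about whether the change is worth shipping.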
The Product DS
The second category of data scientists focuses on analyzing changes to the core product, including search rankings and price recommendations.
For example, at a hotel booking company they'll often use datasets with logged search queries, calendar preferences, and booking flow interactions. These projects tend to be a mix of experimental analysis and product development (with machine learning models and simulations).
The challenge for this category of data scientists is to keep things simple. It's easy to propose a complex solution. It's hard to come up with a simple one. You may come up with a complex model, but many times you have to convey the business logic behind the model to a non-technical person. Always keep these three rules in mind while deciding on a product feature: Simplicity Makes People Happy, Simplicity Makes People Think Better, Simplicity Makes People Spend Money. By thinking like a business person, you'll be able to avoid many of the common mistakes.
Then come the transactional DSs and behavioral DSs, who mostly work on developing models that use machine learning techniques to identify anomalies in transaction data and user behavior data, respectively. These data scientists are also responsible for analyzing any product changes associated with these models, like payment options or verified identifications. In addition to working with the Product team, they will often partner with operations teams, like sales representatives, fraud analysts, and security teams.
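To make the idea of flagging anomalies in transaction data concrete, here is a minimal sketch that uses a simple statistical baseline (the modified z-score based on the median absolute deviation) rather than a trained machine learning model. The function name, the threshold default, and the sample amounts are assumptions for illustration only.

```python
import statistics

def flag_anomalies(amounts, threshold=3.5):
    """Return the amounts whose modified z-score exceeds the threshold.

    Uses the median and the median absolute deviation (MAD), which are
    less sensitive to extreme values than the mean and standard deviation.
    """
    median = statistics.median(amounts)
    mad = statistics.median([abs(x - median) for x in amounts])
    if mad == 0:
        # All values are (nearly) identical; nothing can be scored.
        return []
    # 0.6745 scales the MAD so the score is comparable to a z-score
    # for normally distributed data.
    return [x for x in amounts if 0.6745 * abs(x - median) / mad > threshold]

# Illustrative transaction amounts; 500 stands out from the rest.
print(flag_anomalies([20, 22, 19, 21, 23, 20, 500]))
```

In practice a fraud model would look at many features per transaction, not just the amount, but a baseline like this is a common first check.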
Lastly, there are the operational DSs, who partner with the rest of the departments.
An operational data scientist will interact with policy, finance, sales, billing, marketing, and human resources.
For example, some companies calculate a Net Promoter Score (NPS) and try to find the three steps that would increase the score. These steps can't be taken in isolation by a particular department of the company. Multiple departments have to coordinate and act in sync to implement them. This is where the operational data scientists, or core analytics team, come into the picture. In addition to having machine learning skills, these people need a good understanding of the business and have to look at ideas from a practical angle. They should also have good people skills to interact with others and convince them of these ideas.
These data scientists tend to rely more on third-party sources (like survey, census, or employee data) and less on experimental data.
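For readers unfamiliar with the Net Promoter Score mentioned above, here is a minimal sketch of the standard calculation from 0-10 survey ratings: the percentage of promoters (9 or 10) minus the percentage of detractors (0 to 6). The function name and the sample ratings are made up for illustration.

```python
def net_promoter_score(ratings):
    """Compute NPS from a list of 0-10 'would you recommend us?' ratings."""
    if not ratings:
        raise ValueError("at least one rating is required")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    # Passives (7 or 8) count toward the total but not toward either group.
    return 100 * (promoters - detractors) / len(ratings)

# Illustrative survey responses: 5 promoters, 3 detractors, 2 passives.
print(net_promoter_score([10, 9, 8, 7, 6, 3, 10, 9, 5, 10]))
```

The score ranges from -100 to 100, which is why raising it usually depends on several departments improving the experience of detractors and passives together.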