Businesses are developing an appetite for data science. According to a recent report from job site Indeed, the demand for data scientists increased by 29 percent year-on-year and by 344 percent since 2013. The role of data scientist has also been rated the best job in America for three years running by Glassdoor. As Andrew Flowers, an economist at Indeed and author of the abovementioned report, puts it: “The job of a data scientist has only grown sexier. More employers than ever are looking to hire data scientists.”
The rising tide of complex data volumes and the expanding deployment of technologies such as machine learning and AI have increased enterprises’ need for data science skills – however, the supply of data scientists is still struggling to keep up.
And what about data analysts? It is not uncommon to find this title being used almost as a synonym for data scientist. But there are several distinct differences between the two roles in terms of education, skill sets, practice, and salary. Let’s take a deeper look.
Distinguishing Between Data Scientist and Data Analyst
A 2012 HBR article, which may have been the first to grant the title ‘Sexiest Job of the 21st Century’ to data scientists, defines the role as “hybrid data hacker, analyst, communicator and trusted advisor” with the “training and curiosity to make sense of big data.”
For a more formal definition, we turn to the industry standards published by the Institute for Apprenticeships (IfA). According to the IfA, data scientists “find information in diverse data sets to address complex problems and improve organizational processes.” They do this by “gathering new sources of data, performing statistical analysis, building and validating data models, using programming practices, and maintaining data, tools and processes to implement robust and valuable data solutions.”
By comparison, the IfA defines data analysts as those who “collect, organize and study data to provide business insight,” and are “typically involved with managing, cleansing, abstracting and aggregating” data for the purposes of doing analytical studies on that data.
It seems that data analysts are not exactly data scientists – but the two roles aren’t exactly worlds apart either, at least in terms of analytical aptitude. So, what are the key differences when it comes to data analytics vs. data science?
Data Science vs. Data Analytics
Data science is a multifaceted practice that draws from several disciplines to extract actionable insights from large volumes of unstructured data. These disciplines include statistics, data analytics, data mining, data engineering, software engineering, machine learning, predictive analytics, and more. Data science is as much about producing insights from large data sets as it is about finding more efficient and productive ways to model and analyze data. It has been described as an approach that can shed light even on issues that “we don’t know we don’t know”, meaning data science has the potential to deliver insights that can address problems we haven’t even identified yet.
Data analytics is a key component of the data science process, but its scope is narrower in terms of objectives and expected outcomes. The approach is more focused on finding answers to questions that have topical relevance and on delivering insights that can enable immediate outcomes.
One approach to illuminating the functional nuances between data science and data analytics is to map their scope onto “The Four Analytic Capabilities” framework put forward by Gartner Research in 2017. Gartner’s framework visualizes a comprehensive analytics environment progressing from traditional capabilities – namely “descriptive” and “diagnostic” – to the more complex techniques of “predictive” and “prescriptive”. Data analysts are more likely to be tasked with performing descriptive and diagnostic analysis when compared with data scientists, who work with predictive and prescriptive analytics.
Though it may seem like these two practices operate at two ends of the spectrum, they are still complementary functions within the broader effort to convert data into business value. For instance, a data analyst may extract a new data set, identify interesting trends, and present the results that will enable immediate business outcomes. A data scientist could then build on these initial findings, using advanced machine learning, software engineering and statistical techniques to predict the future.
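The hand-off described above can be sketched in a few lines of Python. This is a minimal illustration, not a real workflow: the sales figures and the function names (`describe`, `fit_linear_trend`, `forecast`) are hypothetical, and a straight-line trend stands in for the far richer models a data scientist would normally use. The first function summarizes what has already happened (descriptive analysis), while the last one extrapolates forward (predictive analysis).

```python
from statistics import mean

# Hypothetical monthly sales figures (illustrative data, not from any real source).
monthly_sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]


def describe(values):
    """Descriptive analytics: summarize what has already happened."""
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "total_change": values[-1] - values[0],
    }


def fit_linear_trend(values):
    """Fit y = intercept + slope * t by ordinary least squares, with t = 0, 1, 2, ..."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = mean(values)
    covariance = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    variance = sum((t - t_mean) ** 2 for t in range(n))
    slope = covariance / variance
    intercept = y_mean - slope * t_mean
    return intercept, slope


def forecast(values, steps_ahead):
    """Predictive analytics: extrapolate the fitted trend past the last observation."""
    intercept, slope = fit_linear_trend(values)
    return intercept + slope * (len(values) - 1 + steps_ahead)


summary = describe(monthly_sales)                    # the analyst's view of the past
next_month = forecast(monthly_sales, steps_ahead=1)  # the scientist's view of the future
print(summary)
print(next_month)
```

In practice the two steps are rarely this cleanly separated, but the sketch shows why the roles complement each other: the forecast is only as good as the data set and trends the descriptive work has surfaced.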
Data scientists combine the core competencies of a data analyst with a host of advanced skills in machine learning, programming, mathematical modeling, etc. So, let’s now take a look at some in-demand skills for both data scientists and data analysts.
Defining Skill Profiles for Data Scientists and Data Analysts
Before we dive into the distinctive skill requirements for each of these functions, it’s necessary to highlight a couple of skills that are essential for both – domain expertise and communication.
Domain expertise is critical to both roles. Understanding the unique dynamics and challenges of a particular domain or industry is imperative in identifying projects that are relevant to it. For example, within a highly specialized domain like healthcare, a working knowledge of the variables related to a health outcome is essential to ensuring that the right model is built. Domain expertise is especially important for data analysts as they typically have shallower experience across a broad spectrum of skills.
Communication skills are also essential for both roles, as both data scientists and analysts will eventually have to present their findings and solutions to business users with little understanding of statistical analyses and mathematical modeling. As such, all data specialists need to have strong written and verbal communication skills that allow them to share their insights with stakeholders across the business.
So, what about the core skills of the separate roles? First, data analysts:
- Statistics: A strong foundation in mathematical and statistical concepts and above-average literacy in data analytics are must-haves for any aspiring data analyst. More senior positions will require candidates with skills in applying techniques such as multivariate A/B testing, predictive modeling, trend analysis, and cluster analysis to real business situations. Analysts must also be well versed in different data structures and storage methods, and in creating robust data sets.
- Query & Analysis: Data analysts need to have deep expertise in Structured Query Language (SQL). They have to be skilled at collecting, organizing and managing data, creating customized queries, and even building database structures from the ground up. Analysts also need to constantly update their data skills by exploring new languages and technologies.
- Programming: Analysts need to have programming experience – especially in languages like R, Python, and MATLAB, which are all widely used for statistical and predictive analytics on large data sets – as they will often be tasked with solving problems that out-of-the-box software isn’t powerful enough to handle.
- Data Visualization: Analysts must have the skills to deliver their findings and insights in a visually engaging manner that resonates with a non-technical audience.
Over and above all this, analysts with theoretical or practical exposure to machine learning techniques will have an edge when it comes to moving up the data science hierarchy.
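To make the Query & Analysis skill more concrete, here is a small sketch of the kind of customized query an analyst might write. It uses Python's built-in `sqlite3` module with an in-memory database; the `orders` table, its columns, and the revenue threshold are all hypothetical and exist only for illustration.

```python
import sqlite3

# In-memory database with a hypothetical "orders" table (illustrative schema and data).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders (region, amount) VALUES (?, ?)",
    [
        ("north", 120.0),
        ("north", 80.0),
        ("south", 200.0),
        ("south", 50.0),
        ("west", 75.0),
    ],
)

# Aggregate revenue per region, keep regions at or above a threshold,
# and list the highest-earning region first.
query = """
    SELECT region, COUNT(*) AS order_count, SUM(amount) AS revenue
    FROM orders
    GROUP BY region
    HAVING SUM(amount) >= ?
    ORDER BY revenue DESC
"""
rows = conn.execute(query, (100.0,)).fetchall()
conn.close()

for region, order_count, revenue in rows:
    print(region, order_count, revenue)
```

The same pattern of grouping, filtering, and ordering carries over to production databases, although the SQL dialect and connection details will differ.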
So, what about data scientists?
Data scientists build on the core skills of data analysts, but they also need a range of specialized competencies across mathematics, programming and domain knowledge.
Indeed recently came up with a list of the five most in-demand tech skills for data scientists, based on postings on its site. Three of these skills – R, Python and SQL – are shared with data analysts. The other two are:
- Machine Learning: Machine learning can provide a real impetus for data science with its ability to learn from data with minimal human intervention. But in order to leverage it effectively, data scientists will need to understand several machine learning methods, including linear and logistic regression, tree models and ensembles like Random Forest and GBM, and neural networks.
- Hadoop: Hadoop enables data scientists to handle large-scale data that is both structured and unstructured. It also provides modules for analyzing large volumes of data. Given that these two factors are fundamental to a data scientist’s role, hands-on knowledge of Hadoop and the platform’s utility in data operations can give data scientists a competitive edge.
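As a rough illustration of one of the machine learning methods named above, the following sketch trains a one-feature logistic regression with plain gradient descent. It is a teaching example under stated assumptions rather than a production approach: the study-hours data set is invented, the function names and hyperparameters are hypothetical choices, and real projects would normally rely on an established library instead of hand-written training loops.

```python
import math

# Hypothetical data: hours studied and whether the student passed (1) or not (0).
hours_studied = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
passed = [0, 0, 0, 0, 1, 1, 1, 1]

# Center the feature around its mean so that gradient descent behaves well.
mean_hours = sum(hours_studied) / len(hours_studied)
centered = [h - mean_hours for h in hours_studied]


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_regression(xs, ys, learning_rate=0.1, epochs=2000):
    """Learn a weight and bias by gradient descent on the average log loss."""
    weight, bias = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(weight * x + bias) - y for x, y in zip(xs, ys)]
        grad_w = sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(errors) / n
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


def predict(weight, bias, x):
    """Classify as 1 when the predicted probability is at least 0.5."""
    return 1 if sigmoid(weight * x + bias) >= 0.5 else 0


weight, bias = train_logistic_regression(centered, passed)
predictions = [predict(weight, bias, x) for x in centered]
print(predictions)
```

The model "learns from data with minimal human intervention" in the sense that the weight and bias are never set by hand; only the training procedure and its hyperparameters are.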
Apart from these technical capabilities, the data science team at Sequoia Capital has also defined five connected skills that are required of any data scientist.
- Problem Formulation: the ability to break down complex business problems and pose them as a set of technical problems.
- Technical Ability: the technical skills required to extract data.
- Analytics Ability: the ability to manipulate data and extract value.
- Synthesis: the ability to connect the information and insights produced back to their problem formulation questions.
- Influence: the ability to drive decision making by creating an impactful yet succinct presentation.
Data science is an evolving practice and skills remain in high demand. Given the vast difference in expectations between data scientists and data analysts, the former do tend to be better remunerated. There are, however, some signs that salaries for data scientists may have begun to plateau. According to data from Glassdoor Economic Research – cited in TechRepublic – data scientist salaries actually shrank 1.2 percent in March 2019. According to Stack Overflow, data scientist salaries were leveling off against software developer salaries.
This is somewhat surprising given all the fuss about the short supply of data scientists. But the trend is attributable to the explosion in data science courses, which has resulted in a fivefold increase in applicants for entry-level data science roles – without doing much to actually address the shortage of skilled data science professionals.
Even as data volumes continue to grow exponentially, a majority of businesses still lack the capabilities to analyze or categorize all the data they have stored. Closing that gap will mean continued demand for data science talent, including data scientists, data analysts, and data engineers. Currently, data scientists tend to get higher positions and salaries, but they are also expected to have advanced degrees and strong technical skills across a range of statistical, mathematical and programming techniques. The demand for data analysts is just as high, though expectations and salaries are comparatively lower. But with the right technical skills and competencies, even entry-level data analysts could eventually qualify for the sexiest job of the 21st century.