Research Brief: Four Functional Clusters of Analytics Professionals

Authored by:
Pasha Roberts, Chief Scientist
Greta Roberts, CEO
July 2013
©2013 Talent Analytics, Corp.

Key Findings

Digging into a cross-industry study of analytics professionals, we identify four distinct patterns of how these workers spend their week: (1) Data Preparation, (2) Manager, (3) Programmer, and (4) Generalist. These functional clusters are defined by unique time-usage patterns and, further, exhibit important differences across dimensions of education, demographics, and mindset. This brief quantifies these characteristics and their implications for sourcing, hiring, managing, and retaining analytics professionals in each category.

Four Functional Clusters of Analytics Professionals

The groundbreaking 2012 Analytics Professionals Study by Talent Analytics, Corp. and the International Institute for Analytics used many measures to understand the characteristics of modern analytics professionals and data scientists [1]. The study examined 302 active analytics professionals in a diverse sample of companies, industries, sizes and circumstances. For the purpose of this paper we use the terms data scientist and analytics professional interchangeably.

Our premise? Data scientists have been discussed as if they performed a single role. We suspected this role was wider, containing a broader workflow of tasks. Clarifying the analytics workflow and the tasks performed by data scientists could also provide insight to those looking to hire and retain professionals in this important role.

To this end, we asked participants in our study how many hours per week they spent in various analytics-related activities. Every attempt was made to capture and reflect the tasks in a modern analytics workflow.

The study also measured 11 “raw talent” factors for each individual, also known as aptitude or mindset. Aptitude is distinct from achievement.
This study will show how aptitude and other factors differ across job types.

[1] See IIA Research Brief Quantifying Analytical Talent for additional results of the study.

©2013 Talent Analytics, Corp. All rights reserved

Time Spent on Tasks in Analytics Workflow

In aggregate, the study sample spent roughly the same percentage of time performing tasks in the following 11 categories:

Analytics Function                         Percent of Week (Mean)
Analysis Design                             8.9%
Data Acquisition and Collection            10.8%
Data Preparation                           12.5%
Data Analytics                             13.0%
Data Mining                                 7.4%
Visualization                               6.8%
Programming                                 8.9%
Interpretation                              8.7%
Presentation                                8.2%
Administration                              6.3%
Managing other Analytics Professionals      8.4%

However, upon deeper analysis it became clear that there were several “types” of workweeks at hand in this sample.

Fuzzy Clustering Yielded Best Results for Understanding Analyst Work

When we reviewed the 11 different analytics functions, 4 categories emerged, each containing multiple functions.

Cluster             Count   Percent
Data Preparation      78      26%
Programmer            51      17%
Manager               37      12%
Generalist           136      45%

Several algorithms were examined to perform cluster analysis upon this sample, and “Fuzzy Clustering” delivered the best results. This method implies that each item belongs to each cluster to some degree, which makes sense given the fluidity of most analysts' work. The best results were found with four clusters, which were named based on their dominant activities.

Reading Density Plot Chart Types (as seen below):

This brief uses a chart type known as a “Density Plot”. A “bell curve” is a density plot. It displays the estimated population percentage for each possible value of a variable, that is, the “density” of that variable. The horizontal X axis measures a single raw talent metric (like Curiosity) on a scale of 1 to 100. The farther to the right, the closer the score is to 100, and the more Curious the individual is.
The vertical Y axis shows the percentage of the population estimated at each raw talent score. The higher up, the more of the population will have the score at X. In a random sample of people, the same number of people would have a Curiosity score of 29 as would have a Curiosity score of 78. Therefore, a random sample of people would display as a flat line at 1%, since each of the 100 possible scores is equally likely (we drew a dotted line at 1% to show where a random sample of people would score). When viewing our density plots, the most interesting information is found where the curve departs from the dotted line, that is, where it sits above or below the random sample line.

Analysts in All Functional Clusters Have Important Similarities

For the purpose of this paper we focus primarily on teasing apart differences among the 4 functions inside the data scientist role. These differences have implications for hiring, promoting and retaining. Before we tease apart the differences, it is interesting to note the similarities among the functional clusters.

Similarities Seen in All Functional Clusters

Analysts in all 4 functional clusters have two things in common:
1) very strong intellectual curiosity (Theoretical Drive, see Figure 1), and
2) strong drive to create out-of-the-box solutions (Creative Drive, see Figure 2).

Figure 1: Level of Intellectual Curiosity (the further right, the more curious)
[Density plot: TA Score 'THE' (0 to 100) vs. probability density, one curve per functional cluster (Data Preparation, Generalists, Managers, Programmers). Chart note: All clusters skew high. Clearly Curiosity is a “must” regardless of function in an analytics role.]

Figure 2: Level of Creativity (the further right, the more creative)
[Density plot: TA Score 'CRE' (0 to 100) vs. probability density, one curve per functional cluster. Chart note: All clusters skew high.]

Four Functional Clusters

Cluster 1: Data Preparation Analysts

Analytics professionals in the first group, the Data Preparation cluster, spend a significant amount (46%) of their time gathering and preparing data for analysis later in the analytics workflow.

Analytics Function   Percent Time per Week
design                7.2%
dataAcq              21.4%
dataPrep             24.5%
analytic             11.6%
mining               10.5%
vis                   4.1%
prog                  4.1%
interp                6.3%
present               5.4%
admin                 2.9%
manage                2.0%

Figure 3: Time by Analytics Function for Data Preparation Cluster

How are Data Preparation Analysts Different from other Analysts?

• One interesting difference is that they are decidedly less Competitive (less Politically motivated). This group is less interested in advancing up the hierarchy, competing for an internal promotion as a reward, or engaging in power struggles with management, other analysts, employees or their customers. (See Figure 4)

Figure 4: Density of Drive to Compete and Win (the further right, the more motivated to compete)
[Density plot: TA Score 'POL' (0 to 100) vs. probability density, one curve per functional cluster. Chart note: Data Prep and Programmers low on competitiveness.]

• Our analysis showed that Data Preparation analysts have the strongest aptitude for detail and are the least likely to make mistakes. This makes sense, as a Data Preparation role requires constant attention to detail and attracts those who embody this quality. (See Figure 5)

Figure 5: Density of Drive to be Exact, Accurate, Mistake-free (the further right, the more detailed, exact, precise in their work)
[Density plot: TA Score 'E' (0 to 100) vs. probability density, one curve per functional cluster. Chart note: Data Prep high on detail.]

Sourcing, Hiring, Managing and Retaining Data Preparation Analysts

• Sourcing: Data Preparation candidates are likely to be found in other areas of your organization, particularly in roles that are detail rich. Remember that aptitude for detail and accuracy needs to be reviewed in addition to the requirement for strong intellectual curiosity and creativity. Of course, it is also vital to evaluate intelligence and training for such a role, but Data Preparation requires the least statistical domain knowledge of the four clusters.
• Hiring: When trying to attract candidates to this role, do not focus on political growth, career advancement, or a future with lots of senior-level visibility.
• Managing: Data Preparation staff will want details about their goals and performance rather than general comments. They will keep details on their own projects and performance and will be disappointed if these are not similarly tracked by their manager.
• Retaining: The work in the Data Preparation cluster is definitely on the “back office” side of analytics, but it is a large and essential function. Professionals in this role share much of the same creativity, intellectual interests and beliefs as other analytics professionals. They will easily become bored and leave for another role that satisfies their intellectual curiosity more fully. Ultimately they are looking for a mentally challenging role that appeals to their natural curiosity and creativity more than to career advancement.

Cluster 2: Analytics Programmers

The second functional cluster consists of analytics professionals whose workweek is weighted more heavily toward programming: writing computer code to manipulate and process data. They spend more than 3 times as much time programming as any of the other clusters. That said, analytics Programmers still spend only about one third of their time, on average, programming. The rest of the time they spend on other analytics-related activities, like other analysts.
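The “more than 3 times” comparison can be checked against the programming shares reported in the time-by-function charts for each cluster (Figures 3, 6, 7 and 11). The following is a minimal sketch of that arithmetic; the variable names are ours, not the study's.

```python
# Percent of the week spent programming ("prog"), per cluster,
# as reported in Figures 3, 6, 7 and 11 of this brief.
programming_share = {
    "Data Preparation": 4.1,
    "Programmer": 32.8,
    "Manager": 0.9,
    "Generalist": 4.9,
}

programmer = programming_share["Programmer"]

# Find the non-Programmer cluster that programs the most.
others = {name: pct for name, pct in programming_share.items() if name != "Programmer"}
closest_name, closest_share = max(others.items(), key=lambda item: item[1])

# How many times more of the week Programmers spend programming.
ratio = programmer / closest_share

print(f"Programmers spend {programmer:.1f}% of the week programming")
print(f"Next highest cluster: {closest_name} at {closest_share:.1f}%")
print(f"Ratio: {ratio:.1f}x")
```

On these figures the Programmer cluster spends between six and seven times as much of the week programming as the next cluster (Generalists), so the “more than 3 times” statement is conservative, and 32.8% is consistent with “one third of the time”.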
Analytics Function   Percent Time per Week
design                6.0%
dataAcq               8.1%
dataPrep             10.2%
analytic             14.2%
mining                5.8%
vis                   6.9%
prog                 32.8%
interp                7.2%
present               4.7%
admin                 2.3%
manage                1.7%

Figure 6: Time by Analytics Function for Programmer Cluster

• This is the youngest age group; almost half are under 29 years old.
• Programmers are the least experienced group; over half have less than 5 years of either business or analytics experience.
• This cluster is similar to the Data Preparation cluster in its lack of desire to climb the corporate ladder. Programmers have no goal of heading a large organization to increase their stature. (See Figure 4)
• From an aptitude perspective, Programmers have the strongest desire to collaborate, making sure they gain alignment on their work. (See Figure 10)

Sourcing, Hiring, Managing and Retaining Analytics Programmers

• Sourcing: Look for analytics Programmers inside your organization in current programming roles. Make note of those who install beta versions of software, push the limits of functionality, or constantly experiment. Recent college graduates could be a good source; ask what projects they are working on, even if these are only personal projects.
• Hiring: Candidates in this role are most interested in learning new software, staying on the leading edge of technology and analytics, being given some free rein to experiment and explore, and being involved in continuous learning. The farther they get from hands-on work, the more bored and dissatisfied they will become.
• Managing: Given the age and experience level of Programmers, managers would be wise to take the time to mentor them on general business knowledge, business expectations and perhaps how to maneuver politically inside an organization. Their lack of political savvy could land them in trouble if they are not given some insight and boundaries.
They will be a quick study, and will be happy to learn.
• Retaining: Like all analytics professionals, this cluster will easily become bored. Financial incentives and promises that they will move up the ladder will not appeal to them, nor make them feel valued or challenged. Ultimately they are looking for a mentally challenging role that appeals to their natural curiosity and creativity more than to career advancement.

Cluster 3: Analytics Managers

The third cluster, Analytics Managers, report that they spend more than half their time, on average, managing their analytics team and performing a variety of administrative tasks. Their workload leans towards managing direct reports and projects, and then presenting the results of projects to their customers.

Analytics Function   Percent Time per Week
design                8.9%
dataAcq               3.1%
dataPrep              2.8%
analytic              4.9%
mining                3.8%
vis                   1.9%
prog                  0.9%
interp                7.6%
present               9.6%
admin                19.2%
manage               37.3%

Figure 7: Time by Analytics Function for Managers Cluster

How Analytics Managers Differ from other Analysts

• Managers are the oldest of the functional clusters; fewer than 10% are under 29.
• Along with Analytics Generalists, Managers showed broader experience and more years as an analyst than other groups (which makes sense given their more advanced age).
• Managers accounted for the largest number of Ph.D.s (27%), yet the largest group of Managers (40%) held only a Master's degree.
• Managers had a strong competitive and political aptitude, quite different from the other 3 clusters.
  o Managers were more competitive than any other cluster (see Figure 4), reflecting their choice to lead, compete and engage with the corporate hierarchy.
  o Interestingly, Managers also had higher and more consistent “altruistic” scores, signaling a drive to mentor and help their team, as well as empathy with clients who discover that their tried-and-true, older approaches could easily be improved by newer analytics models (see Figure 8). That said, very few of the sample exhibited “altruistic” scores at the top of the scale.

Figure 8: Density of Drive to be Compassionate and Empathetic (the further right, the more compassionate)
[Density plot: TA Score 'ALT' (0 to 100) vs. probability density, one curve per functional cluster. Chart note: Managers higher on altruism.]

• Finally, our sample of Managers displayed less of a drive to focus on “Economic” results than all other groups. This was curious and unexpected to the study team. It indicates that Managers (at least today) are focusing more on knowledge and leadership-oriented drivers, such as “being the smartest person on the team” and “taking care of their team”, while analysts at a lower level focus more on attaining tangible results. (See Figure 9)

Figure 9: Density of Drive to Achieve Bottom Line Results, or to See ROI (the further right, the more focused on results, including personal financial results)
[Density plot: TA Score 'ECO' (0 to 100) vs. probability density, one curve per functional cluster. Chart note: Managers have the lowest focus on economics and bottom-line results.]

  o Managers (and Generalists) showed a high score for a “Command and Control” approach to managing. This is quite a contrast to the other two clusters. (See Figure 10)

Figure 10: Density of Drive to use an Assertive Management Style (the further right, the more bold and confident the approach)
[Density plot: TA Score 'C' (0 to 100) vs. probability density, one curve per functional cluster. Chart note: Managers and Generalists have the most assertive management style.]

Sourcing, Hiring, Managing and Retaining Analytics Managers

• Sourcing: Management candidates can be found among existing analytics professionals or in other management areas inside the organization. They will be easy to identify, as they will be very focused (even as candidates) on what they need to do to advance and move up the management chain.
  – As the discipline of analytics progresses, we wonder if the management role will change from one where the manager needs to be the smartest person in the room to one where the manager has a team of smart, curious problem solvers and is more focused on results while keeping everyone else on target.
  – It might be interesting to source management candidates from other internal business areas, where a manager's analytics expertise is lower but their management aptitude is more advanced.
• Hiring: Management candidates will be focused on advancing and one day managing their own team of direct reports. If advancement is a real opportunity, this is something to point out in a job ad, job description or interview. But be aware that, if they come on board, they will not forget that they were told they could advance and will feel robbed if this is not addressed.
• Managing: Those with management aspirations will feel less motivated until their path for advancement is clear. They will see leadership and visibility to other leaders as a bonus and something to strive for. Given that they have less of a focus on results, perhaps their advancement could be tied to goal achievement, completing projects on time, etc.
• Retaining: Managers in our Study showed that what they care most about is learning, not being bored, and being able to advance inside the organization. Ironically, our sample of Managers was the least interested in financial rewards. So, as with other types of analytics professionals, additional money won't be turned down, but it won't be a factor in retaining them in a job they don't enjoy.

Cluster 4: Analytics Generalists

One cluster of analytics professionals, Analytics Generalists, did not report spending significant time in any focused area. Generalists in our study were found in a wide variety of companies and industries. Contrary to the study's original hypothesis, Generalists are found in very large organizations as well as in small companies. We suspect Generalists work in companies of all sizes because:

• Analytics team sizes continue to be small, even in the largest organizations with the largest analytics teams (78% of analytics professionals in our study worked on teams of 10 or fewer people), and
• Analytics hiring managers are only beginning to realize that Data Science isn't a single role. We suspect the role will continue to be defined over the next 12 to 24 months as the discipline advances and matures.

Analytics Function   Percent Time per Week
design               11.0%
dataAcq               7.9%
dataPrep              9.2%
analytic             15.5%
mining                7.3%
vis                   9.7%
prog                  4.9%
interp               10.9%
present              10.7%
admin                 6.2%
manage                6.6%

Figure 11: Time spent by Analytics Function for Generalists Cluster

How Generalists Differ from Other Analysts

• Analytics Generalists appear to be a hybrid of the “raw talent” traits contained in the other 3 Functional Clusters.
• They could be described as most like Managers, with less inclination to be political or controlling and more inclination for tangible results, while tending towards careful and detailed work.
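The idea that a Generalist is a hybrid of the other clusters fits the fuzzy clustering used in this study, in which each analyst belongs to every cluster to some degree. The sketch below illustrates that idea only. It assumes the standard fuzzy c-means membership formula (the brief does not say which fuzzy algorithm or settings were used) and treats the mean time profiles from Figures 3, 6, 7 and 11 as cluster centers. The function name, the fuzzifier m = 2 and the example workweek are hypothetical.

```python
import math

# Order of the 11 analytics functions used in the time-by-function charts.
FUNCTIONS = ["design", "dataAcq", "dataPrep", "analytic", "mining", "vis",
             "prog", "interp", "present", "admin", "manage"]

# Mean percent of week per function for each cluster (Figures 3, 6, 7, 11).
# Assumption: these means stand in for the cluster centers.
PROFILES = {
    "Data Preparation": [7.2, 21.4, 24.5, 11.6, 10.5, 4.1, 4.1, 6.3, 5.4, 2.9, 2.0],
    "Programmer": [6.0, 8.1, 10.2, 14.2, 5.8, 6.9, 32.8, 7.2, 4.7, 2.3, 1.7],
    "Manager": [8.9, 3.1, 2.8, 4.9, 3.8, 1.9, 0.9, 7.6, 9.6, 19.2, 37.3],
    "Generalist": [11.0, 7.9, 9.2, 15.5, 7.3, 9.7, 4.9, 10.9, 10.7, 6.2, 6.6],
}


def fuzzy_memberships(week, profiles=PROFILES, m=2.0):
    """Return a degree of membership in each cluster for one workweek.

    Uses the fuzzy c-means membership rule: membership is proportional to
    distance ** (-2 / (m - 1)), normalised so the degrees sum to 1.
    """
    dists = {name: math.dist(week, center) for name, center in profiles.items()}

    # A week that coincides with a center belongs fully to that cluster.
    exact = [name for name, d in dists.items() if d == 0.0]
    if exact:
        return {name: (1.0 / len(exact) if name in exact else 0.0) for name in dists}

    weights = {name: d ** (-2.0 / (m - 1.0)) for name, d in dists.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Hypothetical analyst: percent of week per function, in FUNCTIONS order.
week = [6, 15, 18, 12, 8, 5, 20, 6, 5, 3, 2]
memberships = fuzzy_memberships(week)
for name, degree in sorted(memberships.items(), key=lambda item: -item[1]):
    print(f"{name:17s} {degree:.2f}")
```

A week that mixes heavy data preparation with heavy programming receives substantial membership in both of those clusters rather than a single label, which is the behaviour the brief describes as making sense “given the fluidity of most analysts' work”.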
Tips for Hiring, Sourcing, Managing and Retaining Generalists

• We don't necessarily advocate actively looking for or hiring a Generalist. Given the differences we have seen in the data, even in how Data Scientists spend their time, we suggest actively considering how your analyst is going to spend their time and hiring to those requirements. Being clear about role requirements will increase business performance and job satisfaction, and will reduce the number of top analysts leaving (and saying it was because of money, which we know it isn't).

Business Conclusions

Division of Labor

The field of data analytics is going through rapid change as new data sources and new business opportunities emerge. By nature, very few people are well suited to do everything on the spectrum of analytics: to clean data AND program AND analyze AND present AND manage. This is unrealistic and does not scale.

This Study reveals an ongoing trend to divide the work up between Preparation, Programming, and Management. Ironically, the analysis and visualization stage is rather small by comparison.

It appears that some Generalists are in this role for organizational reasons rather than because of aptitude or personal preference. That is, today's Generalists may have been placed in this role not because they are great at it, but because their analytics role is less well defined and they were hired to “do everything”. Generalists are found in small organizations, where it may be necessary to do everything, and in large organizations, which could easily specialize but do not seem to deploy a division of analytics labor.

If this proves to be the case, over time today's analytics discipline will mature and analytics teams will begin to divide workers into more specialized tasks, like the clusters we have identified.
When this happens, it could be that a group of “True Generalists” will remain, or perhaps these will emerge as the true “Analysts”.

Suggestions around Promoting Analysts into Management Positions

• The optimal pattern for a Manager is different from the patterns for other roles that emerged in the Study. Specifically, only the Manager Cluster has the mentoring and coaching mindset required for effective managers and for moving up the organization.
• It is important to identify and promote analysts who are “made for a management role” and to offer this role to Analysts who see it as a true reward, rather than as something a great analyst needs to do to keep their job. Otherwise, the firm will potentially lose a good analyst and gain a bad Manager.
• Hiring and promoting individuals into management roles with aptitude results close to the optimal pattern will produce the best managers. While few will match the Manager benchmark to the digit, the differences from the pattern will indicate areas for coaching and growth.

About the Authors:

Pasha Roberts is Chief Scientist of Talent Analytics, Corp. Pasha is responsible for all analytics and technology strategy, architecture, development and algorithms of Talent Analytics SaaS solutions. He wrote the first implementation of Talent Analytics' software over a decade ago, and today continues to drive innovation for Talent Analytics' flagship solution, Advisor™. As is often found in Data Science, Pasha has decades of experience and education that span computing, quantitative, artistic and business categories. He is also a “Blue Match” to Talent Analytics' Data Scientist Industry Benchmark (ideal predicted match for a Data Scientist role). Pasha holds a Bachelor's degree in Economics and Russian Studies from the College of William and Mary, and a Master of Science degree in Financial Engineering from the MIT Sloan School of Management.
His thesis at MIT prototyped the application of 3D graphics to massive financial “tick” datasets.

Greta Roberts is CEO of Talent Analytics, Corp. Greta is a sought-after thought leader, presenter, and author, as well as a Faculty Member of the International Institute for Analytics. Under her direction, Talent Analytics has grown to be a leader in predicting and optimizing employee performance, the logical step beyond predicting customer behavior. She is an invited speaker at Predictive Analytics World, SAP Sapphire NOW, and the SAS Analytics 2013 Conference, among others. She has been frequently interviewed and quoted in Data Informed, AllAnalytics, KDNuggets, Smart Data Collective, Analytic Bridge, Forbes, Computerworld, VentureBeat and Sales & Marketing Management magazine.

Note: To license the Talent Analytics Data Scientist benchmark to help build your analytics bench, please contact Talent Analytics directly for more information: 617-864-7474 x.111 or [email protected]