Here, we'll talk about how to build and structure your data team to meet your organization's needs.
Members of your team
You might be surprised to learn that data science isn't a single field; it's actually 3 different jobs. Each position uses a slightly different set of tools to achieve its goals.
Data engineer
- Stores and maintains data.
- Controls the flow of information.
- Builds specialized data storage systems and infrastructure so that data is easy to obtain and access.
- Most data engineers are very familiar with SQL, which they use to store and manage data in databases.
- They also use programming languages such as Java, Scala, or Python to process data and automate data-related tasks.
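As a rough illustration of the kind of task described above, here is a minimal sketch of an automated load job in Python using the standard-library `sqlite3` module. The `orders` table, its columns, the sample CSV text, and the `load_orders` helper are all hypothetical, chosen only for this example; a real pipeline would likely target a production database rather than an in-memory one.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or an API.
RAW_CSV = """order_id,region,amount
1,north,120.50
2,south,80.00
3,north,42.25
"""


def load_orders(conn, raw_csv):
    """Create the (hypothetical) orders table if needed and load CSV rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "order_id INTEGER PRIMARY KEY, "
        "region TEXT NOT NULL, "
        "amount REAL NOT NULL)"
    )
    rows = [
        (int(r["order_id"]), r["region"], float(r["amount"]))
        for r in csv.DictReader(io.StringIO(raw_csv))
    ]
    # INSERT OR REPLACE keeps the load idempotent if the job is rerun.
    conn.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


conn = sqlite3.connect(":memory:")  # in-memory database, for the sketch only
loaded = load_orders(conn, RAW_CSV)
```

Keying the table on `order_id` and using `INSERT OR REPLACE` is one possible design choice: it lets the same job run repeatedly without duplicating rows.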
Data analyst
- Describes the present via data.
- They do this with dashboards, hypothesis testing, and visualization.
- They often have some background in statistics or computer science but tend to have less engineering experience than data engineers and less math experience than data scientists.
- Data analysts use spreadsheets to perform simple analyses on small quantities of data.
- For larger analyses they use SQL, the same language used by data engineers.
- While data engineers build and configure SQL storage solutions, data analysts use existing databases to consume and summarize data.
- Analysts also use business intelligence (BI) tools such as Tableau, Power BI, or Looker to create dashboards and share their analyses.
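To make the idea of consuming and summarizing an existing database concrete, here is a small sketch of an analyst-style query, run from Python with the standard-library `sqlite3` module. The `orders` table and its contents are made up for illustration; in practice the table would already exist and the analyst would only write the `SELECT`.

```python
import sqlite3

# Stand-in for an existing database; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "north", 120.0), (2, "south", 80.0), (3, "north", 40.0)],
)

# The analyst's part: summarize orders and revenue per region.
summary = conn.execute(
    "SELECT region, COUNT(*) AS n_orders, SUM(amount) AS revenue "
    "FROM orders GROUP BY region ORDER BY region"
).fetchall()

for region, n_orders, revenue in summary:
    print(f"{region}: {n_orders} orders, {revenue:.2f} revenue")
```

A result like this is typically what feeds a dashboard or a chart in a BI tool.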
Machine learning scientist
- Machine learning is perhaps the buzziest part of data science. It is used to extrapolate what is likely to be true from what we already know.
- These scientists use training data to classify larger, unrulier data.
- Machine learning can tell us how much a stock might be worth next week, which images contain a car, or what sentiments are expressed by a set of tweets.
- Machine learning scientists use either Python or R to create their predictive models. Both are great programming languages for data science and a candidate who knows one language can likely read code in the other language.
- Remember, programming languages aren't as difficult to learn as spoken languages; someone who speaks French might still need years to learn German.
- Programming languages are more like power tools: if you know how to use a power drill, you don't necessarily know how to use an electric saw, but you can probably learn with a little training.
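The idea of using training data to classify new data can be sketched in a few lines of plain Python. The example below uses a nearest-centroid classifier, picked here only because it is short; it is not a method named in the text, and real projects would more likely rely on a dedicated machine learning library. The features, labels, and function names are all hypothetical.

```python
from math import dist
from statistics import mean

# Hypothetical labeled training data: (hours_active, purchases) -> label
TRAINING = [
    ((1.0, 0.0), "casual"),
    ((2.0, 1.0), "casual"),
    ((8.0, 5.0), "loyal"),
    ((9.0, 7.0), "loyal"),
]


def fit_centroids(training):
    """Average the feature vectors of each label into one centroid."""
    by_label = {}
    for features, label in training:
        by_label.setdefault(label, []).append(features)
    return {
        label: tuple(mean(column) for column in zip(*points))
        for label, points in by_label.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest to the new point."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = fit_centroids(TRAINING)
# Classify two new, unlabeled customers.
predictions = [predict(centroids, x) for x in [(1.5, 1.0), (7.0, 6.0)]]
```

The same pattern (fit on labeled examples, then predict on unlabeled ones) carries over to the tools a machine learning scientist would use in Python or R.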
Data Science team structure
Once you've hired some data professionals, there are 3 main ways you can structure your data team.
- In the first model, the data team stands on its own: it can contain one or multiple types of data employees, but no members of other teams such as engineering or product.
- This is a great team structure for training new team members and changing which project each team member is working on.
- Alternatively, it can be helpful to use an embedded model, where each data employee is part of a squad that also contains engineers and product managers.
- This model lets each data employee gain experience on a specific business project, making them a valuable expert.
- The hybrid model looks similar to the embedded model, but with an additional sync for all data employees across all squads.
- This additional layer of organization allows for uniform data processes and career development regardless of which project an employee is assigned to.