Data-Driven?
The Data Science field probably has more buzzwords than any other, and we love to throw them around. “Big Data”, “Cloud Computing”, “Data-Driven”, you name it. But do we really understand what these words mean and how they are applied in the field?
Take “data-driven”, for instance. You will hear organization leaders boast in their colorful pitch decks about how they use data-driven decision-making (DDDM, as we like to call it) to run their businesses, yet many cannot walk you through what it really entails. Let’s scratch the surface, shall we?
DDDM is sexy, but what does it really entail?
A common understanding of DDDM is that we use data to drive decisions that improve the effectiveness and efficiency of our processes. This involves taking data, finding patterns in it, and then making the decisions that govern our business based on those patterns. In other words: using data to drive decisions (hence the name). The goal of DDDM is to use data to influence or drive some sort of action, and there are steps to doing that.
Step 1: Know your mission: why are you doing what you are doing, and for whom?
Many companies make frequent assumptions about their products or market. For example, they might believe, “A market for this product exists,” or, “This is what our customers want.” But before seeking out new information, first put existing assumptions to the test. Proving these assumptions are correct will give you a foundation to work from. Alternatively, disproving these assumptions will allow you to eliminate any false claims that have, perhaps unknowingly, been negatively impacting your company. Keep in mind that an exceptional data-driven decision usually generates more questions than answers.
Know the specifics of your business and understand what you want to achieve and for whom. By determining the precise questions you need to answer to inform your strategy, you’ll be able to streamline the data collection process and avoid wasting resources. The aim here is also more specific than simply using data to drive decisions: it is geared towards getting value from the data rather than just improving efficiency. Understanding your target audience is an important part of this initial stage of DDDM, because you will use these target users as a reference for what makes sense to them. This helps you interpret the patterns you may see in the later stages.
Step 2: Identify Data Sources
Identify what data you have and how it relates to your target users. This may be based on demographics, behavior studies, or even analytics. Put together the sources from which you’ll be extracting your data. You might be coordinating information from different databases, web-driven feedback forms, and even social media. Finding common variables across various datasets from which to draw insights can be tremendously difficult, but one way to make it easier is to come up with possible scenarios and work out how the available data relates to the target audience. For example, data on the most-liked NBA players would be almost useless to a food production business.
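To make the idea of a "common variable" concrete, here is a minimal sketch in Python. It assumes two hypothetical sources, a customer database and a feedback form, that happen to share a customer_id field. All the records, field names, and the join_on_key helper are invented for illustration; real sources rarely line up this neatly.

```python
# Hypothetical records from two separate sources (all names and values are made up).
crm_customers = [
    {"customer_id": 1, "age_group": "18-24"},
    {"customer_id": 2, "age_group": "25-34"},
    {"customer_id": 3, "age_group": "35-44"},
]
feedback_forms = [
    {"customer_id": 1, "rating": 4},
    {"customer_id": 3, "rating": 2},
    {"customer_id": 3, "rating": 5},
]


def join_on_key(left, right, key):
    """Inner join: keep only pairs of records that share the same key value."""
    # Index the right-hand source by the common variable for fast lookup.
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)

    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


combined = join_on_key(crm_customers, feedback_forms, "customer_id")
for row in combined:
    print(row)
```

Customers with no feedback drop out of the result, which is a decision in itself: whether to keep or discard unmatched records depends on the question you set in Step 1.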
Step 3: Clean and Organize your data
This step is often summed up by the “80/20 Rule”: a commonly quoted rule of thumb that roughly 80% of a data analyst’s time goes into cleaning and organizing data, while the remaining 20% goes into the actual analysis. The step entails preparing raw data for analysis by removing or correcting data that is incorrect, incomplete, or irrelevant, and then organizing it into tables with the right formats and data types, ready for processing. This is necessary because you will be feeding this data into decision-making algorithms that can only work well with clean, organized data.
In another article we shall go deeper into ETL (Extract, Transform and Load), a data integration process that combines data from multiple data sources into a single, consistent data store that is loaded into a data warehouse or other target system.
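As a rough illustration of what removing or correcting incorrect and incomplete data can look like, here is a small Python sketch. The rows, the field names, and the clean helper are all assumptions made up for this example, and the rules it applies (drop rows with a missing or non-numeric age, trim whitespace, drop exact duplicates) are only one possible cleaning policy.

```python
# Made-up raw rows, as they might arrive from a form or a spreadsheet export.
raw_rows = [
    {"name": " Alice ", "age": "34", "city": "Nairobi"},
    {"name": "Bob", "age": "", "city": "Mombasa"},        # incomplete: missing age
    {"name": "Carol", "age": "abc", "city": "Kisumu"},    # incorrect: age is not a number
    {"name": " Alice ", "age": "34", "city": "Nairobi"},  # duplicate of the first row
    {"name": "Dan", "age": "41", "city": "nakuru"},       # inconsistent capitalization
]


def clean(rows):
    """Return rows with trimmed text, integer ages, and no duplicates."""
    cleaned, seen = [], set()
    for row in rows:
        name = row["name"].strip()
        city = row["city"].strip().title()
        age_text = row["age"].strip()

        # Drop rows that are incomplete or whose age is not a whole number.
        if not name or not age_text.isdigit():
            continue

        record = (name, int(age_text), city)
        if record in seen:  # skip exact duplicates
            continue
        seen.add(record)
        cleaned.append({"name": name, "age": int(age_text), "city": city})
    return cleaned


clean_rows = clean(raw_rows)
for row in clean_rows:
    print(row)
```

Note that dropping a row is not always the right call; sometimes an incomplete value is better corrected or filled in from another source.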
Step 4: Perform Analysis
Here, you will start to build models to test your data and answer the business questions you identified in Step 1. Testing different models, such as linear regression, decision trees, and random forests, can help you determine which method is best suited to your data set. This is another step that needs a sequel of its own to go deeper.
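To give a feel for what "testing different models" means, here is a minimal sketch in plain Python. It fits a simple linear regression by least squares and compares it against a naive baseline that always predicts the training average, using a small held-out set. The ad-spend figures are invented, and the baseline and the fit_line and mean_abs_error helpers are assumptions added for illustration; in practice you would likely reach for a dedicated library and far more data.

```python
from statistics import mean

# Invented data: monthly ad spend and units sold.
ad_spend = [10, 20, 30, 40, 50, 60, 70, 80]
units_sold = [25, 44, 68, 81, 105, 118, 145, 160]

# Hold out the last two months so both models are judged on unseen data.
train_x, test_x = ad_spend[:6], ad_spend[6:]
train_y, test_y = units_sold[:6], units_sold[6:]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


def mean_abs_error(predict, xs, ys):
    """Average absolute gap between predictions and actual values."""
    return mean(abs(predict(x) - y) for x, y in zip(xs, ys))


slope, intercept = fit_line(train_x, train_y)
baseline = mean(train_y)  # naive model: always predict the training average

linear_error = mean_abs_error(lambda x: slope * x + intercept, test_x, test_y)
baseline_error = mean_abs_error(lambda x: baseline, test_x, test_y)

print(f"linear regression error: {linear_error:.2f}")
print(f"baseline error:          {baseline_error:.2f}")
```

Whichever model has the lower error on the held-out data is the better candidate for this particular data set, which is the comparison this step is asking you to make.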
Step 5: Demonstrate your Findings
Deciding up front how the information will be presented helps you stay organized when it comes time to interpret and draw insights from the data. There are three ways to present your findings:
Descriptive Information: Just the facts.
Inferential Information: The facts, plus an interpretation of what those facts indicate in the context of a particular project.
Predictive Information: An inference based on the facts, plus advice for further action based on that reasoning.
Step 6: Finally… Conclusion
Ask yourself, “What new information did I learn from the whole process?” Despite the pressure to discover something entirely new, a great place to start is with questions to which you already know (or think you know) the answer. Were your assumptions proved or disproved?
Are there patterns that came up during the analysis, and what would they mean for your business?
The conclusions drawn from your analysis will ultimately help your organization make more informed decisions and drive strategy moving forward. It is important to remember, though, that these findings can be virtually useless if they are not presented effectively. Thus, data analysts must become skilled in the art of data storytelling to communicate their findings to key stakeholders as effectively as possible.
So what now?
The steps above do not capture the whole DDDM process; rather, they give a high-level view of what to expect, and what will be expected of you, if you claim to be data-driven.
If you want to learn more about the field, I’d recommend learning platforms such as LinkedIn Learning and Coursera, which offer courses and certifications in Data Science pathways, or you can enroll in a full-time program at a higher learning institution.