C207: Data-Driven Decision Making Answers

Section 1: Foundational Concepts in Data and Business (15 Questions)


  1. Question: What is the primary difference between Data and Information?
    • Answer: Data is raw, unorganized facts and figures. Information is data that has been processed, organized, and structured to provide context and meaning for decision-making.
  2. Question: What is the goal of Data-Driven Decision Making (DDDM)?
    • Answer: The goal of DDDM is to use empirical evidence (data), rather than intuition, experience, or opinion alone, to guide business strategy and operational choices, leading to better outcomes.
  3. Question: Define the three main types of Data Analytics used in decision-making.
    • Answer:
      • Descriptive: What has happened (historical data).
      • Predictive: What could happen (forecasting future trends).
      • Prescriptive: What should happen (recommending a course of action).
  4. Question: What is the difference between a Population and a Sample in statistics?
    • Answer: A Population is the entire group you want to study or draw conclusions about. A Sample is a subset of the population from which data is actually collected; it should be representative of the population so that conclusions can be generalized.
  5. Question: What is the purpose of a Key Performance Indicator (KPI)?
    • Answer: A KPI is a measurable value that demonstrates how effectively a company is achieving key business objectives. It helps focus analysis and decision-making on the most critical metrics.
  6. Question: What are the three V’s of Big Data?
    • Answer: Volume (massive amount of data), Velocity (the speed at which data is generated and processed), and Variety (the different types and sources of data).
  7. Question: Define Nominal and Ordinal data types.
    • Answer:
      • Nominal: Data used for labeling variables without any quantitative value or order (e.g., gender, color).
      • Ordinal: Data with a distinct order, but the differences between values are not known or consistent (e.g., satisfaction ratings like “Good,” “Better,” “Best”).
  8. Question: What is the difference between Interval and Ratio data?
    • Answer: Interval data has ordered categories and measurable differences, but no true zero point (e.g., temperature in Celsius). Ratio data has ordered categories, measurable differences, and a true zero point (e.g., height, weight, sales revenue).
  9. Question: What is a Database Management System (DBMS)?
    • Answer: A software system that allows users to define, create, maintain, and control access to a database.
  10. Question: What is the primary goal of Data Warehousing?
    • Answer: To consolidate and centralize data from disparate sources into a single, consistent format to support business intelligence activities, reporting, and analysis for decision-making.
  11. Question: What is the difference between Qualitative and Quantitative data?
    • Answer: Quantitative data is numerical and can be counted or measured (e.g., sales figures). Qualitative data is descriptive and non-numerical (e.g., customer feedback comments).
  12. Question: What is the function of a Histogram in data analysis?
    • Answer: A histogram is a bar-style chart that groups a numerical dataset into intervals (bins) and shows how many values fall into each one. It helps visualize the shape of the data and identify its central tendency, spread, and skewness.
  13. Question: Define Data Mining.
    • Answer: The process of discovering patterns, anomalies, and correlations within large datasets to predict outcomes. It’s used to extract actionable insights.
  14. Question: What is Confirmation Bias, and why is it a threat to DDDM?
    • Answer: Confirmation bias is the tendency to seek out, interpret, or favor information that confirms or supports one’s prior beliefs or values. It is a threat because it leads decision-makers to selectively use data, ignoring evidence that contradicts their assumptions.
  15. Question: What is Data Governance?
    • Answer: A system of policies, procedures, and roles that defines how an organization manages its data assets, ensuring data quality, security, privacy, and integrity.
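
Worked example (Histogram): The frequency distribution described in the Histogram answer above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not part of the course material: the function name histogram, the bin width of 10, and the order counts are all hypothetical.

```python
from collections import Counter


def histogram(values, bin_width):
    """Count how many values fall into each bin of width `bin_width`.

    Returns a dict mapping each bin's lower edge to its frequency.
    """
    counts = Counter((v // bin_width) * bin_width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical daily order counts (illustrative data only).
orders = [12, 15, 17, 21, 22, 22, 25, 28, 31, 47]
width = 10
bins = histogram(orders, width)

# Print a text histogram: one '#' per observation in each bin.
for lower, count in bins.items():
    print(f"{lower:>3}-{lower + width - 1:<3} {'#' * count}")
```

Most of the values sit in the 10-29 range with a thin tail toward 47, which is the kind of right skew a histogram makes visible at a glance.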

Section 2: Statistical Methods and Data Quality (15 Questions)


  1. Question: What are the three main measures of Central Tendency?
    • Answer: Mean (average value), Median (middle value), and Mode (most frequently occurring value).
  2. Question: What is Standard Deviation, and what does a high value indicate?
    • Answer: Standard deviation measures the amount of variation or dispersion of a set of values. A high value indicates that the data points are spread out over a wider range (high variability).
  3. Question: What does a Coefficient of Correlation (r) value of -0.9 indicate?
    • Answer: It indicates a strong negative (inverse) linear relationship between two variables. As one variable increases, the other variable strongly decreases.
  4. Question: What is the key distinction between Correlation and Causation?
    • Answer: Correlation indicates that two variables move together. Causation means that a change in one variable causes a change in another. Correlation does not imply causation.
  5. Question: What is the purpose of Regression Analysis?
    • Answer: To understand how the value of a dependent variable changes when one or more independent variables are varied. It is used for prediction and forecasting.
  6. Question: In statistics, what is a Hypothesis Test?
    • Answer: A formal procedure for deciding between two competing hypotheses, the null hypothesis (H0) and the alternative hypothesis (Ha), using data from a sample.
  7. Question: What is a p-value in hypothesis testing?
    • Answer: The p-value is the probability of observing the data (or data more extreme) if the null hypothesis is true. A small p-value (typically ≤0.05) suggests the evidence is strong enough to reject the null hypothesis.
  8. Question: What is Sampling Error?
    • Answer: The natural difference or variation that exists between a sample statistic (e.g., sample mean) and the actual population parameter (e.g., population mean).
  9. Question: What is the key characteristic of Random Sampling?
    • Answer: Every member of the population has an equal chance of being selected for the sample, which helps ensure the sample is representative and minimizes sampling bias.
  10. Question: Why is Data Cleansing/Scrubbing a critical step in the DDDM process?
    • Answer: It corrects or removes errors, incomplete records, duplicates, and inconsistencies from datasets. Using poor-quality data leads to flawed analysis and poor decisions (“Garbage In, Garbage Out”).
  11. Question: What is Bias in data measurement?
    • Answer: A systematic or non-random error in the data collection process that causes the measured results to differ consistently from the true value.
  12. Question: What is the purpose of a Control Chart in process management?
    • Answer: A control chart is a statistical tool used to determine if a manufacturing or business process is in a state of statistical control (operating consistently within its expected boundaries).
  13. Question: Define Confidence Interval.
    • Answer: A range of values, derived from a sample, that is likely to contain the value of an unknown population parameter with a specified degree of confidence (e.g., 95% confidence).
  14. Question: What is Cross-Validation in the context of predictive modeling?
    • Answer: A technique used to assess how the results of a statistical analysis or model will generalize to an independent dataset. It helps prevent overfitting (a model that performs well on training data but poorly on new data).
  15. Question: What is the potential decision-making pitfall of Data Overload?
    • Answer: Data overload can cause “analysis paralysis”: decision-makers become overwhelmed by the sheer volume of data and either delay or avoid decisions, or revert to intuition instead of using the data.
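
Worked example (Central Tendency, Standard Deviation, Correlation, Regression): The measures defined in this section can be computed with the Python standard library. This is a minimal sketch, assuming small made-up datasets; the variable names, the helper pearson_r_and_line, and all numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math
import statistics

# Hypothetical sample of weekly support tickets (illustrative only).
tickets = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(tickets)      # arithmetic average
median = statistics.median(tickets)  # middle value of the sorted data
mode = statistics.mode(tickets)      # most frequently occurring value
spread = statistics.stdev(tickets)   # sample standard deviation (n - 1)


def pearson_r_and_line(xs, ys):
    """Return (r, slope, intercept) for a simple linear fit of ys on xs."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    r = sxy / math.sqrt(sxx * syy)   # coefficient of correlation
    slope = sxy / sxx                # least-squares regression slope
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical price (independent) and units sold (dependent).
price = [10, 12, 14, 16, 18]
units = [50, 44, 41, 33, 30]
r, slope, intercept = pearson_r_and_line(price, units)

print(f"mean={mean}, median={median}, mode={mode}, stdev={spread:.2f}")
print(f"r={r:.2f}, units = {intercept:.1f} + ({slope:.2f} * price)")
```

Here r is close to -1, a strong negative linear relationship like the r = -0.9 case above, and the fitted line can be used to predict units sold at a new price. The correlation alone does not show that price changes cause the change in sales.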

Section 3: Visualization, Modeling, and Ethics (15 Questions)


  1. Question: What is the main objective of Data Visualization?
    • Answer: To represent data and information visually (charts, graphs, maps) to make complex data understandable and actionable for decision-makers who may not be data experts.
  2. Question: What is a common pitfall of poor data visualization (e.g., misleading axes)?
    • Answer: It can create a misleading perception of trends or magnitudes, causing decision-makers to incorrectly interpret the data and make flawed choices.
  3. Question: What is the role of a Dashboard in DDDM?
    • Answer: A dashboard provides a central, real-time visual display of the most important metrics and KPIs needed to achieve specific business objectives.
  4. Question: What is a Decision Tree model?
    • Answer: A predictive model that represents a sequence of decisions as a branching, tree-like structure, with each branch leading to possible consequences, outcomes, and costs. It helps visualize choices whose outcomes depend on conditional probabilities.
  5. Question: What is a key benefit of using Simulation Modeling (e.g., Monte Carlo)?
    • Answer: It allows decision-makers to model and test the outcome of a decision under hundreds or thousands of different uncertain conditions (risk and probability), leading to more robust decisions.
  6. Question: What is A/B Testing?
    • Answer: A method of comparing two versions of a single variable (A and B) to determine which one performs better. It is used to make data-driven decisions on user interface, marketing, and product changes.
  7. Question: What does it mean for an algorithm or model to be “Black Box”?
    • Answer: A model is “Black Box” when its internal workings are opaque and cannot be easily understood or explained. This is a problem for DDDM because it reduces transparency and accountability.
  8. Question: What is the ethical concern related to Algorithmic Bias?
    • Answer: Algorithmic bias occurs when a model’s output reflects unfair or discriminatory assumptions, often because the training data reflected existing societal biases, leading to unethical decisions (e.g., biased loan approvals).
  9. Question: How does GDPR (General Data Protection Regulation) impact data-driven decision making?
    • Answer: GDPR sets strict rules for collecting, processing, and storing the personal data of individuals in the EU. Organizations need a lawful basis for processing (such as consent) and must honor rights such as erasure (the “right to be forgotten”), which constrains what data they may access and use in decision-making.
  10. Question: What is the ethical principle of Data Security?
    • Answer: The obligation to protect data from unauthorized access, use, disclosure, disruption, modification, or destruction, particularly sensitive data like PII (Personally Identifiable Information).
  11. Question: What is the role of Scenario Planning in prescriptive analytics?
    • Answer: Scenario planning involves developing multiple plausible, yet distinct, possible future states (scenarios) to test how an organization’s decisions or strategies will perform under various circumstances.
  12. Question: What is the ethical issue of Data Misuse?
    • Answer: Using collected data for purposes other than those for which consent was given, or using data in a way that harms or unfairly targets individuals or groups.
  13. Question: What is the concept of “Data Literacy” for a non-data-expert decision-maker?
    • Answer: The ability to read, comprehend, analyze, and argue with data. It is essential for DDDM because leaders must be able to interpret analyses and question underlying assumptions.
  14. Question: What is the objective of Text Mining in DDDM?
    • Answer: To extract useful, actionable information from unstructured text data (e.g., customer reviews, emails, social media posts) to inform decisions about products, services, or customer sentiment.
  15. Question: How does Iterative Decision-Making relate to data?
    • Answer: It is a process of continually refining a decision or strategy by making small, quick changes, measuring the results with data, and using those results to inform the next iteration. This ensures the decision process is constantly informed by real-world feedback.
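
Worked example (Simulation Modeling): The Monte Carlo idea from the Simulation Modeling answer above can be sketched as follows. This is an illustration under stated assumptions, not a method prescribed by the course: the profit model, the demand and margin distributions, the fixed cost, and the function names simulate_profit and monte_carlo are all hypothetical.

```python
import random


def simulate_profit(rng):
    """Draw one possible outcome of a hypothetical product launch."""
    demand = rng.gauss(1000, 200)        # assumed: units sold, mean 1000, sd 200
    unit_margin = rng.uniform(4.0, 6.0)  # assumed: profit per unit sold
    fixed_cost = 4500                    # assumed: launch cost
    return demand * unit_margin - fixed_cost


def monte_carlo(trials, seed=0):
    """Run many trials; return (average profit, estimated probability of a loss)."""
    rng = random.Random(seed)  # fixed seed makes the run repeatable
    profits = [simulate_profit(rng) for _ in range(trials)]
    loss_probability = sum(p < 0 for p in profits) / trials
    return sum(profits) / trials, loss_probability


mean_profit, loss_probability = monte_carlo(10_000)
print(f"average profit: {mean_profit:.0f}")
print(f"estimated chance of a loss: {loss_probability:.1%}")
```

Instead of a single forecast, the decision-maker sees both the expected profit and how often the launch loses money across thousands of uncertain conditions. Changing the assumed distributions shows how sensitive the decision is to each assumption.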