Section 1: Foundational Concepts and Data Classification (15 Questions)
- Question: What is the primary goal of Quantitative Analysis for Business?
- Answer: To apply mathematical and statistical techniques to raw data to extract meaningful insights that support objective, data-driven business decisions.
- Question: What is the difference between Descriptive Statistics and Inferential Statistics?
- Answer: Descriptive Statistics summarize and describe the main features of a dataset (e.g., mean, median). Inferential Statistics draw conclusions or make predictions about a larger population based on a sample of data.
- Question: Define a Population and a Sample in statistical terms.
- Answer: A Population is the entire group of items or individuals being studied. A Sample is a representative subset of the population used for data collection and analysis.
- Question: What is the difference between a Parameter and a Statistic?
- Answer: A Parameter is a characteristic of a population (e.g., population mean μ). A Statistic is a characteristic of a sample (e.g., sample mean x̄).
- Question: Name and define the four main levels of Data Measurement.
- Answer:
- Nominal: Categorical, no order (e.g., product type).
- Ordinal: Categorical, has a meaningful order (e.g., satisfaction ratings).
- Interval: Numerical, ordered, consistent differences, but no true zero (e.g., temperature in Celsius).
- Ratio: Numerical, ordered, consistent differences, and a true zero point (e.g., sales revenue, height).
- Question: What is the most flexible level of measurement and why is it preferred for analysis?
- Answer: The Ratio level, because it allows for the most sophisticated statistical techniques, including multiplication and division, due to the presence of a true zero.
- Question: Define Time Series Data.
- Answer: Data collected over successive time periods (e.g., monthly sales figures, quarterly profits) to track changes and identify trends.
- Question: What are the three most common measures of Central Tendency?
- Answer: Mean (average), Median (middle value), and Mode (most frequent value).
- Question: Which measure of central tendency is least affected by outliers?
- Answer: The Median, as it is based on the position of the data and not the actual value of extreme scores.
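The two answers above can be checked with a small sketch. The order values are hypothetical, chosen only to show how one extreme value pulls the mean while the median barely moves.

```python
from statistics import mean, median, mode

# Hypothetical order values (illustrative only).
orders = [20, 22, 22, 25, 26]
with_outlier = orders + [500]  # one extreme order added

# Compare each measure before and after the outlier is added.
print("mean:  ", mean(orders), "->", mean(with_outlier))
print("median:", median(orders), "->", median(with_outlier))
print("mode:  ", mode(orders), "->", mode(with_outlier))
```

In this made-up dataset the mean more than quadruples once the outlier is included, while the median shifts by only 1.5 and the mode does not change.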
- Question: What is the purpose of measuring Dispersion (or variability) in data?
- Answer: To understand the spread or scattering of data points around the central value. Key measures include Range, Variance, and Standard Deviation.
- Question: Define Standard Deviation.
- Answer: The square root of the variance; it measures the typical distance of data points from the mean, expressed in the same units as the data. It is a widely used measure of risk (volatility) in finance.
- Question: What does a Coefficient of Variation (CV) measure?
- Answer: The CV is a measure of relative variability; it expresses the standard deviation as a percentage of the mean. It’s useful for comparing the dispersion between datasets with different means.
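As a rough illustration of the standard deviation and CV answers above, the sketch below compares two hypothetical stores whose sales differ by a factor of ten. The helper name `coefficient_of_variation` and the figures are assumptions made for this example, and the sample standard deviation is used.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation expressed as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

# Hypothetical weekly sales for two stores of very different size.
store_a = [100, 110, 90, 105, 95]
store_b = [1000, 1100, 900, 1050, 950]

cv_a = coefficient_of_variation(store_a)
cv_b = coefficient_of_variation(store_b)

print("std dev:", stdev(store_a), stdev(store_b))  # differs by a factor of ten
print("CV (%): ", cv_a, cv_b)                      # the same relative spread
```

The standard deviations are not directly comparable because the means differ, but the CVs show that both stores have the same relative variability.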
- Question: What does a Skewness value of zero indicate about the data distribution?
- Answer: A skewness of zero indicates no skew to either side, as in a perfectly symmetrical distribution (like the normal bell curve), where the mean, median, and mode coincide.
- Question: What is an Outlier, and why must a business analyst pay attention to them?
- Answer: An outlier is an observation point that is distant from other observations. They must be checked because they can distort statistical results (especially the mean and standard deviation) or indicate critical errors/opportunities.
- Question: What is a Box-and-Whisker Plot, and what does the box represent?
- Answer: A graphical display of the five-number summary (Min, Q1, Q2/Median, Q3, Max). The box represents the Interquartile Range (IQR), which spans Q1 to Q3 and contains the middle 50% of the data.
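The five-number summary behind a box plot can be sketched as follows. The data are hypothetical, and the quartiles use the "inclusive" interpolation method of Python's `statistics.quantiles`; other textbooks and tools use slightly different quartile conventions, so small differences are expected.

```python
from statistics import quantiles

# Hypothetical observations, sorted so min and max are the end points.
data = sorted([2, 4, 4, 5, 7, 8, 9, 10, 12, 30])

# n=4 splits the data into quartiles and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

five_number_summary = (data[0], q1, q2, q3, data[-1])
iqr = q3 - q1  # height of the box: the middle 50% of the data

print("five-number summary:", five_number_summary)
print("IQR:", iqr)
```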
Section 2: Probability and Distribution (15 Questions)
- Question: Define Probability.
- Answer: A numerical measure of the likelihood that an event will occur. Its value is always between 0 (impossible) and 1 (certain).
- Question: What is a Random Variable?
- Answer: A numerical description of the outcome of a statistical experiment or a random phenomenon.
- Question: Distinguish between a Discrete Random Variable and a Continuous Random Variable.
- Answer: A Discrete RV can only take on a countable number of values (e.g., number of defects). A Continuous RV can take on any value within a given range (e.g., time, weight).
- Question: What does the Expected Value (E(x)) of a random variable represent?
- Answer: The weighted average of all possible values; it is the long-run average outcome if the experiment is repeated many times.
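A minimal sketch of the expected value as a probability-weighted average, for a discrete random variable. The function name `expected_value` and the defect distribution are assumptions made for illustration.

```python
def expected_value(outcomes):
    """Weighted average of (value, probability) pairs.

    Raises ValueError if the probabilities do not sum to 1.
    """
    total_probability = sum(p for _, p in outcomes)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(value * p for value, p in outcomes)

# Hypothetical distribution of defects per day: (number of defects, probability).
daily_defects = [(0, 0.5), (1, 0.3), (2, 0.2)]

print(expected_value(daily_defects))  # 0*0.5 + 1*0.3 + 2*0.2
```

No single day has 0.7 defects; the figure is the long-run average over many days.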
- Question: When is a Binomial Distribution appropriate for modeling a situation?
- Answer: When there are a fixed number of independent trials (n), each trial has only two possible outcomes (success/failure), and the probability of success (p) is constant for every trial.
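Under the conditions just listed, the probability of exactly k successes follows the standard binomial formula. The sketch below is one way to compute it; the function name `binomial_pmf` and the 10% defect rate are hypothetical.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for n independent trials, each with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

# Hypothetical case: 10 items inspected, each defective with probability 0.1.
n, p = 10, 0.1
print("P(no defects):      ", binomial_pmf(0, n, p))
print("P(exactly 1 defect):", binomial_pmf(1, n, p))
```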
- Question: What key assumptions must be met to use the Poisson Distribution?
- Answer: The Poisson Distribution models the number of occurrences of an event in a fixed interval of time or space. It assumes that events occur independently and at a constant average rate (λ).
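For illustration, the standard Poisson probability formula can be sketched as below. The function name `poisson_pmf` and the rate of 2 arrivals per minute are assumptions for this example.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """P(X = k) when events occur independently at average rate lam per interval."""
    return exp(-lam) * lam ** k / factorial(k)

# Hypothetical case: on average 2 customers arrive per minute.
lam = 2
for k in range(4):
    print(f"P({k} arrivals in a minute) = {poisson_pmf(k, lam):.4f}")
```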
- Question: What is the most important characteristic of the Normal Probability Distribution?
- Answer: It is a symmetrical distribution, with the shape of a bell curve. The mean, median, and mode are all equal and located at the center.
- Question: What is the Empirical Rule (or 68-95-99.7 Rule) for a Normal Distribution?
- Answer:
- Approx. 68% of the data falls within ±1 standard deviation (σ) of the mean (μ).
- Approx. 95% falls within ±2σ of μ.
- Approx. 99.7% falls within ±3σ of μ.
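The three percentages in the Empirical Rule can be reproduced from the standard normal cumulative distribution. This is a small sketch using `statistics.NormalDist` (Python 3.8+); the helper name `share_within` is hypothetical.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k}σ: {share_within(k):.4f}")
```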
- Question: What is a Z-Score (Standardized Value), and what is its purpose?
- Answer: A Z-score measures how many standard deviations a data point is away from the mean. It allows comparison of data points from different normal distributions.
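A short sketch of the comparison a Z-score makes possible. The two tests and their means and standard deviations are hypothetical figures.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical: two candidates scored on tests with different scales.
z_a = z_score(85, mu=70, sigma=10)     # test A: mean 70, std dev 10
z_b = z_score(620, mu=500, sigma=100)  # test B: mean 500, std dev 100

print(z_a, z_b)
```

Although 620 is the larger raw score, the score of 85 sits further above its own mean in standard-deviation terms, so it is the stronger relative result.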
- Question: State the Central Limit Theorem (CLT) in simple terms.
- Answer: Regardless of the shape of the population distribution, the distribution of sample means will tend toward a Normal Distribution as the sample size (n) becomes sufficiently large. A common rule of thumb is n ≥ 30.
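The CLT can be illustrated with a small simulation. The sketch below assumes an exponential population (strongly right-skewed, with mean 1 and standard deviation 1), a sample size of 30, and 2,000 repeated samples; all of these choices are illustrative, not prescribed by the theorem.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(42)  # fixed seed so the run is repeatable
n = 30                   # size of each sample
num_samples = 2000       # number of repeated samples

# Draw many samples from a skewed population and record each sample mean.
sample_means = [
    mean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

print("mean of sample means:   ", mean(sample_means))   # close to the population mean of 1
print("std dev of sample means:", stdev(sample_means))  # close to 1 / sqrt(n)
print("1 / sqrt(n):            ", 1 / sqrt(n))
```

A histogram of `sample_means` would look roughly bell-shaped even though the individual observations are heavily skewed.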
- Question: What is the Standard Error of the Mean?
- Answer: The standard deviation of the sampling distribution of the sample means. It measures the typical amount of error when using a sample mean to estimate a population mean.
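The standard error of the mean is conventionally computed as σ/√n (or s/√n when the population standard deviation is estimated from the sample). The sketch below uses a hypothetical σ of 20 to show how the error shrinks as the sample grows.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the mean for standard deviation sigma and sample size n."""
    return sigma / sqrt(n)

# Hypothetical population standard deviation of 20.
for n in (25, 100, 400):
    print(f"n = {n:3d}: standard error = {standard_error(20, n)}")
```

Because of the square root, quadrupling the sample size only halves the standard error.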
- Question: What is a Point Estimate?
- Answer: A single value (e.g., the sample mean x̄) used to estimate an unknown population parameter (e.g., the population mean μ).
- Question: What is a Confidence Interval?
- Answer: A range of values, calculated from sample data, that is likely to contain the true value of a population parameter with a specified level of confidence (e.g., 95%).
- Question: What effect does increasing the confidence level (e.g., from 90% to 99%) have on the confidence interval width?
- Answer: Increasing the confidence level will increase the width of the confidence interval.
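The link between confidence level and interval width can be sketched with a z-based interval for a mean. This assumes the population standard deviation is known (or the sample is large); the function name `mean_confidence_interval` and the figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, confidence=0.95):
    """z-based confidence interval for a population mean (sigma assumed known)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    margin = z * sigma / sqrt(n)
    return x_bar - margin, x_bar + margin

# Hypothetical sample: mean 50, sigma 10, n = 100.
widths = {}
for level in (0.90, 0.95, 0.99):
    low, high = mean_confidence_interval(50, 10, 100, level)
    widths[level] = high - low
    print(f"{level:.0%}: ({low:.2f}, {high:.2f})  width = {widths[level]:.2f}")
```

The point estimate stays at 50 in every case; only the margin of error grows as more confidence is demanded.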
- Question: What is the primary purpose of conducting a Hypothesis Test?
- Answer: To formally test a claim or belief (hypothesis) about a population parameter using evidence provided by sample data.
Section 3: Regression, Modeling, and Decision Tools (15 Questions)
- Question: What are the two competing hypotheses in a statistical test?
- Answer: The Null Hypothesis (H0) (the status quo, what is assumed to be true) and the Alternative Hypothesis (Ha or H1) (the claim the researcher is trying to support).
- Question: What is a Type I Error (α) in hypothesis testing?
- Answer: Rejecting the Null Hypothesis (H0) when it is actually true (a “false positive”). The probability of this error is the significance level (α).
- Question: What is a Type II Error (β) in hypothesis testing?
- Answer: Failing to reject the Null Hypothesis (H0) when it is actually false (a “false negative”).
- Question: What is the function of the p-value in hypothesis testing?
- Answer: The p-value is the probability of observing the sample data (or data more extreme) if the null hypothesis is true. If the p-value ≤ α, we reject H0.
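The decision rule "reject H0 if the p-value ≤ α" can be sketched with a two-sided one-sample z-test. This assumes a known population standard deviation; the function name `two_sided_p_value` and all figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p_value(x_bar, mu0, sigma, n):
    """p-value for H0: mu = mu0 against Ha: mu != mu0 (sigma assumed known)."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical: H0 says the mean is 50; a sample of 100 gives a mean of 52.
alpha = 0.05
p_value = two_sided_p_value(x_bar=52, mu0=50, sigma=10, n=100)

print(f"p-value = {p_value:.4f}")
print("reject H0" if p_value <= alpha else "fail to reject H0")
```

With the same data and a stricter α of 0.01, the test would fail to reject H0, which shows that the conclusion depends on the significance level chosen in advance.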
- Question: Define Simple Linear Regression in the context of business analysis.
- Answer: A statistical method used to model the relationship between two variables: one dependent variable (Y) and one independent/predictor variable (X) by fitting a straight line to the data.
- Question: In regression analysis, what does the Coefficient of Determination (R²) measure?
- Answer: It measures the proportion of the total variation in the dependent variable (Y) that is explained by the independent variable (X). A higher R² indicates a better fit.
- Question: What does a Coefficient of Correlation (r) value close to −1 indicate?
- Answer: A strong negative linear relationship, meaning as the independent variable increases, the dependent variable strongly decreases.
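The three regression answers above can be tied together in one least-squares sketch. It relies on the fact that in simple linear regression R² equals r squared. The function name `simple_linear_regression` and both datasets are hypothetical.

```python
from math import sqrt
from statistics import mean

def simple_linear_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r, r_squared).
    """
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r = sxy / sqrt(sxx * syy)  # coefficient of correlation
    return slope, intercept, r, r ** 2

# Hypothetical advertising spend (X) and sales (Y).
slope, intercept, r, r_squared = simple_linear_regression(
    [1, 2, 3, 4, 5], [2, 4, 5, 4, 5]
)
print(f"Y = {intercept:.1f} + {slope:.1f}X, r = {r:.3f}, R² = {r_squared:.2f}")

# Hypothetical price (X) and demand (Y) lying exactly on a falling line.
_, _, r_negative, _ = simple_linear_regression([1, 2, 3], [6, 4, 2])
print("r for a perfectly negative relationship:", r_negative)
```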
- Question: What is a common pitfall of regression analysis that an analyst must avoid?
- Answer: Assuming causation based only on correlation. Regression shows association, not necessarily that X causes Y.
- Question: What is Multicollinearity in the context of Multiple Regression?
- Answer: A statistical phenomenon where two or more independent variables in a multiple regression model are highly correlated with each other, which can make the estimates of the individual predictor coefficients unreliable.
- Question: What is a Decision Tree?
- Answer: A tool used in decision theory that visually models sequential decisions under uncertainty. It shows the possible outcomes, costs, and probabilities of different choices.
- Question: What is the Expected Monetary Value (EMV) in decision analysis?
- Answer: A weighted average of all possible payoffs for a decision, where the weights are the probabilities of each outcome. It is the most commonly used criterion for decisions under risk: the alternative with the highest EMV is chosen.
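A minimal EMV sketch for a payoff table with two states of nature. The alternatives, payoffs, and 50/50 probabilities are hypothetical figures chosen for illustration.

```python
def emv(payoffs, probabilities):
    """Probability-weighted average payoff of one alternative."""
    return sum(payoff * p for payoff, p in zip(payoffs, probabilities))

# Hypothetical states of nature: favorable market, unfavorable market.
probabilities = [0.5, 0.5]
payoff_table = {
    "large plant": [200_000, -180_000],
    "small plant": [100_000, -20_000],
    "do nothing": [0, 0],
}

emvs = {name: emv(payoffs, probabilities) for name, payoffs in payoff_table.items()}
best_alternative = max(emvs, key=emvs.get)

print(emvs)
print("choose:", best_alternative)
```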
- Question: How does the Expected Value of Perfect Information (EVPI) help a business decision-maker?
- Answer: EVPI is the maximum amount a decision-maker should be willing to pay for perfect information (i.e., information that eliminates all uncertainty) before making a decision.
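EVPI is conventionally computed as the expected payoff with perfect information minus the best EMV without it. The sketch below assumes a hypothetical payoff table with two equally likely states of nature; the function name `evpi` is also an assumption.

```python
def evpi(payoff_table, probabilities):
    """Expected value with perfect information minus the best EMV."""
    # Best EMV available without any further information.
    best_emv = max(
        sum(payoff * p for payoff, p in zip(payoffs, probabilities))
        for payoffs in payoff_table.values()
    )
    # With perfect information, the best alternative is picked for each state.
    ev_with_perfect_info = sum(
        p * max(payoffs[state] for payoffs in payoff_table.values())
        for state, p in enumerate(probabilities)
    )
    return ev_with_perfect_info - best_emv

# Hypothetical states of nature: favorable market, unfavorable market.
probabilities = [0.5, 0.5]
payoff_table = {
    "large plant": [200_000, -180_000],
    "small plant": [100_000, -20_000],
    "do nothing": [0, 0],
}

print("EVPI:", evpi(payoff_table, probabilities))
```

In this made-up case a market study that costs more than the EVPI would not be worth buying, however accurate it is.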
- Question: What is the difference between a Decision under Certainty and a Decision under Uncertainty?
- Answer: Certainty: The decision-maker knows the exact outcome of each choice. Uncertainty: The decision-maker knows the possible outcomes but cannot assign probabilities to them (e.g., using criteria like maximin or maximax).
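The maximax and maximin criteria mentioned above need no probabilities, only the payoff table. A short sketch with hypothetical payoffs:

```python
# Hypothetical payoffs per alternative: [favorable market, unfavorable market].
payoff_table = {
    "large plant": [200_000, -180_000],
    "small plant": [100_000, -20_000],
    "do nothing": [0, 0],
}

# Maximax (optimistic): pick the alternative with the best best-case payoff.
maximax_choice = max(payoff_table, key=lambda name: max(payoff_table[name]))

# Maximin (pessimistic): pick the alternative with the best worst-case payoff.
maximin_choice = max(payoff_table, key=lambda name: min(payoff_table[name]))

print("maximax:", maximax_choice)
print("maximin:", maximin_choice)
```

The two criteria can point to opposite ends of the table, which is why the decision-maker's attitude to risk matters when probabilities are unavailable.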
- Question: What is Sensitivity Analysis in a decision model?
- Answer: The process of testing how changes in the input variables (e.g., probabilities, costs, or revenues) affect the optimal solution to determine how “sensitive” the final decision is to errors or changes in the estimates.
- Question: What is the purpose of the ANOVA (Analysis of Variance) test in quantitative analysis?
- Answer: To determine if there are statistically significant differences between the means of two or more independent groups by comparing the variance between the groups to the variance within the groups.
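The between-group versus within-group comparison is summarized by the F statistic of a one-way ANOVA. The sketch below computes only that statistic; judging significance additionally requires an F-distribution table or a statistics library, which is outside this example. The function name `one_way_anova_f` and the group data are hypothetical.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic: mean square between groups divided by mean square within groups."""
    k = len(groups)
    n_total = sum(len(group) for group in groups)
    grand_mean = mean([x for group in groups for x in group])

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

# Hypothetical satisfaction scores for three store layouts.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
print("F =", one_way_anova_f(groups))
```

A large F means the group means differ by more than the spread inside each group would explain.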