Project Overview
Ran a statistical inference analysis on equity earnings and price data, covering two-tailed and one-tailed t-tests, chi-squared variance testing, and confidence interval construction, plus a simple linear regression on a separate TV-viewing and quiz-score dataset. These techniques form the foundation of quantitative validation in equity research, risk management, and empirical finance.
📋Problem Statement
Validate distributional assumptions about a stock universe: determine whether population mean earnings and prices match hypothesized benchmarks, and test whether earnings variance is consistent with expectations. Separately, quantify the linear relationship between a behavioral factor (daily TV-viewing hours) and a performance outcome (quiz score).
🎯Analytical Approach
Applied classical parametric inference to a sample of 50 equity observations. Used R's built-in t.test() for location tests, manual chi-squared statistic computation for variance testing, and lm() for linear regression. Each test follows the formal hypothesis-testing protocol: state H0/HA, choose α, compute the test statistic, compare it to the critical value (or the p-value to α), and draw a conclusion.
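The protocol above can be sketched in a few lines. The project itself used R's t.test(); the Python version below is only an illustration using the standard library. The function name one_sample_t, the sample values, and the table-lookup critical value are assumptions made for this example, not the project's data or code.

```python
import math
import statistics


def one_sample_t(sample, mu0, t_crit):
    """Two-tailed one-sample t-test decided by a critical value.

    t_crit is the upper alpha/2 quantile of Student's t with n - 1
    degrees of freedom, supplied from a table rather than computed here.
    """
    n = len(sample)
    mean = statistics.mean(sample)
    se = statistics.stdev(sample) / math.sqrt(n)   # standard error of the mean
    t_stat = (mean - mu0) / se                     # test statistic
    ci = (mean - t_crit * se, mean + t_crit * se)  # (1 - alpha) confidence interval
    reject = abs(t_stat) > t_crit                  # compare to critical value
    return {"t": t_stat, "df": n - 1, "ci": ci, "reject": reject}


# Hypothetical earnings-per-share values (NOT the project's data).
eps = [4.1, 5.3, 6.2, 5.8, 4.9, 5.5, 6.0, 4.7, 5.1, 5.4]

# H0: mu = 5 vs HA: mu != 5, alpha = 0.05.
# Table value: t(0.975, df = 9) is about 2.262.
result = one_sample_t(eps, mu0=5.0, t_crit=2.262)
print(result)
```

The confidence interval and the critical-value comparison give the same decision: H0 is rejected exactly when the hypothesized mean lies outside the interval, which is why the price test against 50 can be read directly off the 95% CI.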
💾Data & Variables
Sample of 50 equity observations with two variables: earnings per share and stock price. A separate dataset contains daily TV-viewing hours and quiz scores for regression analysis. All data imported from Excel using read_excel() from the readxl package. Descriptive statistics computed before formal inference.
🔧Methods & Tests
Two-tailed t-test (H0: μ_earnings = 5): t = 1.04, p = 0.303 → fail to reject H0.
Confidence interval test (H0: μ_price = 50): 95% CI = [55.64, 78.95] excludes 50 → reject H0.
Two-tailed t-test via p-value (H0: μ_price = 80): t = −2.18, p = 0.034 → reject H0.
One-sided t-test (H0: μ_earnings ≤ 4): t = 2.31, p = 0.012 → reject H0.
Chi-squared variance test (H0: σ²_earnings = 2.5): statistic falls inside the critical region → reject H0.
OLS regression (score ~ TV_hours): slope β = −4.302, R² = 0.7443.
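The manual chi-squared variance test uses the statistic χ² = (n − 1)s² / σ0² and, in the two-tailed case, rejects H0 when it lands in either tail beyond the critical values. Below is a minimal Python sketch of that computation, not the project's R code. The sample variance of 4.0 is a made-up value, since the write-up does not report s². The critical values are approximate table lookups for 49 degrees of freedom, and variance_chi2_test is a hypothetical helper name.

```python
def variance_chi2_test(sample_var, n, sigma0_sq, chi2_lower, chi2_upper):
    """Two-tailed chi-squared test of H0: sigma^2 = sigma0_sq.

    chi2_lower and chi2_upper are the alpha/2 and 1 - alpha/2 quantiles of
    the chi-squared distribution with n - 1 degrees of freedom (from a table).
    """
    stat = (n - 1) * sample_var / sigma0_sq
    # The critical (rejection) region is both tails beyond the critical values.
    in_critical_region = stat < chi2_lower or stat > chi2_upper
    return stat, in_critical_region


# n = 50 and sigma0^2 = 2.5 come from the write-up; s^2 = 4.0 is hypothetical.
# Approximate table values for df = 49 at alpha = 0.05: 31.55 and 70.22.
stat, reject = variance_chi2_test(
    sample_var=4.0, n=50, sigma0_sq=2.5, chi2_lower=31.55, chi2_upper=70.22
)
print(f"chi-squared = {stat:.1f}, reject H0: {reject}")
```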
✨Key Results
Of the five formal tests, four reject the null hypothesis at α = 0.05; only the two-tailed test of μ_earnings = 5 fails to reject (p = 0.303). Mean price = $67.30 (significantly different from both $50 and $80). The data support mean earnings above $4 (p = 0.012). Earnings variance differs significantly from 2.5. TV-hours regression: each additional daily hour of TV is associated with a quiz score about 4.3 points lower; R² = 0.74 means the linear fit explains about 74% of the variance in scores.
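The slope and R² reported for the regression come from ordinary least squares. As a rough illustration of what lm() computes for a single predictor, the Python sketch below applies the closed-form formulas to a small invented dataset. The simple_ols helper and the five data points are assumptions for the example and will not reproduce the project's β = −4.302 or R² = 0.7443.

```python
def simple_ols(x, y):
    """Closed-form OLS for y = intercept + slope * x, plus R-squared."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    r_squared = 1 - ss_res / ss_tot   # share of variance explained by the fit
    return slope, intercept, r_squared


# Hypothetical data (NOT the project's dataset).
tv_hours = [0, 1, 2, 3, 4]
scores = [90, 86, 80, 78, 71]

slope, intercept, r_squared = simple_ols(tv_hours, scores)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r_squared:.4f}")
```

The slope is read as the expected difference in score between two observations one TV hour apart. It describes an association in the sample, not the effect of changing anyone's viewing time.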
🧠Key Learnings
The choice of one-tailed vs two-tailed test affects both power and conclusion: a one-tailed test has more power against an alternative in the hypothesized direction, but that direction must be fixed a priori, before looking at the data. The chi-squared variance test is sensitive to departures from normality, so robust alternatives should be considered (Levene's and Brown-Forsythe apply when comparing variances across groups; for a single sample, a bootstrap interval for the variance is one option). A high R² in simple regression does not establish causation; omitted-variable bias and reverse causation require separate treatment.
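The link between one-tailed and two-tailed p-values can be checked numerically from the reported t statistics with 49 degrees of freedom. The Python sketch below is an assumption-laden illustration, not the project's method: it integrates the Student's t density with Simpson's rule instead of calling R's pt(), and t_pdf, upper_tail, and two_sided_p are hypothetical helper names. Its results should land close to the reported p-values but will not match them to the last digit, because the t statistics are themselves rounded.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def upper_tail(t, df, steps=2000):
    """P(T > t), using Simpson's rule on [0, |t|] (steps must be even)."""
    a = abs(t)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    tail = 0.5 - total * h / 3        # area beyond |t| on one side
    return tail if t >= 0 else 1 - tail


def two_sided_p(t, df):
    """Two-tailed p-value: both tails beyond |t|."""
    return 2 * upper_tail(abs(t), df)


df = 49  # n - 1 for the sample of 50
print(f"two-tailed, t =  1.04: p = {two_sided_p(1.04, df):.3f}")
print(f"two-tailed, t = -2.18: p = {two_sided_p(-2.18, df):.3f}")
print(f"one-tailed, t =  2.31: p = {upper_tail(2.31, df):.3f}")
```

When the statistic falls in the hypothesized direction, the one-tailed p-value is exactly half the two-tailed one. That halving is where the extra power comes from, and it is also why the direction cannot be chosen after seeing the data.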