Brodie Beck

2022-09-07

Suppose I regress earnings on age, without knowing the distribution of earnings. Would it be possible for the residuals to be normally distributed?

My first thought is that it would not be possible: earnings take only positive values, whereas the support of the normal distribution runs from $-\infty$ to $\infty$, so the residuals could not be normal. However, since residuals are errors, they can be both positive and negative, so I am starting to question my hypothesis here.


Tanya Anthony

2022-09-08

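If earnings are always positive then no, the residuals cannot be exactly normally distributed, even though many may be negative: a negative residual can never be larger in magnitude than the fitted value at that age, so the magnitude of the negative residuals is bounded by the highest predicted earnings on the regression line. A normal distribution has no such lower bound.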
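The bound described above can be checked numerically. The following is a minimal sketch, not part of the original answer: the simulated data (lognormal earnings, ages 20 to 64) and the helper `ols_fit` are assumptions made purely for illustration.

```python
import random


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


random.seed(0)

# Hypothetical data: strictly positive, right-skewed earnings that rise with age.
ages = [random.randint(20, 64) for _ in range(500)]
earnings = [random.lognormvariate(10 + 0.01 * a, 0.6) for a in ages]

intercept, slope = ols_fit(ages, earnings)
fitted = [intercept + slope * a for a in ages]
residuals = [y - f for y, f in zip(earnings, fitted)]

# Since every y > 0, each residual y - f is strictly greater than -f,
# so no residual can fall below minus the largest fitted value.
lower_bound_holds = all(r > -f for r, f in zip(residuals, fitted))

print("smallest residual:     ", min(residuals))
print("minus largest fitted:  ", -max(fitted))
print("largest residual:      ", max(residuals))
print("lower bound holds:     ", lower_bound_holds)
```

In a run like this the positive residuals are free to be very large while the negative ones are capped, which is the asymmetry the answer points to.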

That may not be the major issue, though: more important might be the skewness of the earnings distribution at any given age, or a non-linear relationship between earnings and age.


Isaac Barry

2022-09-09

You can always apply a skew-zero transform to the $y$-variable (earnings) if transforming skewed $x$-variables does not result in normally distributed residuals. Van der Waerden scores would do a good job here. To begin:

1. Determine the percentile value, $\mathrm{pct}_i$, of each $y$-value from its rank position, $R(y_i)$, after an ascending sort.

2. Obtain the van der Waerden scores by plugging the percentile values into the inverse standard normal CDF, i.e., $Z_i = \Phi^{-1}(\mathrm{pct}_i)$.

3. Then regress $Z$ on age, provided age is not too skewed.

By construction, van der Waerden scores are approximately standard normal, $\mathcal{N}(0,1)$, so the residuals from this regression should now be much closer to normally distributed.

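To interpret the coefficient on age in terms of earnings, just back-transform ("deconvolve"): map a fitted score through $\Phi$ to a percentile, then read off the earnings value at that percentile.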
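The three steps above can be sketched in plain Python. This is an illustrative sketch, not the answerer's own code: it assumes the common convention $\mathrm{pct}_i = R(y_i)/(n+1)$ (the answer does not say how the percentile is computed), gives tied values their average rank, and uses made-up toy data. The names `van_der_waerden_scores` and `ols_fit` are hypothetical helpers.

```python
from statistics import NormalDist


def van_der_waerden_scores(values):
    """Map values to normal scores Phi^{-1}(rank / (n + 1)).

    Assumption: pct_i = R(y_i) / (n + 1); ties receive their average rank.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])  # ascending sort
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j to cover the whole group of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return [std_normal.inv_cdf(r / (n + 1)) for r in ranks]


def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical toy data: right-skewed earnings that tend to rise with age.
ages = [22, 25, 31, 38, 44, 47, 53, 60]
earnings = [18_000, 24_000, 31_000, 52_000, 48_000, 95_000, 61_000, 250_000]

z = van_der_waerden_scores(earnings)          # steps 1 and 2
intercept, slope = ols_fit(ages, z)           # step 3: regress Z on age
residuals = [zi - (intercept + slope * a) for zi, a in zip(z, ages)]

print("scores:   ", [round(zi, 3) for zi in z])
print("slope:    ", slope)
print("residuals:", [round(r, 3) for r in residuals])
```

Note that the slope is in units of normal-score (roughly, standard deviations of rank position) per year of age, not currency per year, which is why a back-transformation is needed to talk about earnings.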

