Practice Math Statistics Problems and Master the Concepts

Alfredo Cooley 2022-11-06

I haven't used Bayes' theorem much before so any help would be greatly appreciated.
Suppose you are given following data.5% of the population have heart disease.
If you have heart disease, the probability that you have high blood pressure is 90%
If you do not have heart disease, the probability that you have high blood pressure is 15%
What is the probability that a person chosen at random from the population has high blood pressure?
$P (B) = P (B | H) P (H) + P (B | H^{'}) P (H^{'})$
$P (B) = (.9) (.05) + (.15) (.95) = .1875$
Using Bayes Theorem calculate the probability that the person has heart disease, if they have high blood pressure.
$P (H | B) = \frac{P (B | H) P (H)}{P (B)}$
$(.9) (.05) / .1875 = .24$
Using Bayes Theorem calculate the probability that the person has heart disease, if they do not have high blood pressure.
$P (H | B^{'}) = \frac{P (B^{'} | H) P (B^{'})}{P (H)}$
When I sub in for this part I'm getting an invalid answer

Aleah Avery 2022-11-06

Let $θ^{'}$ , $θ \in Θ$ such that $θ^{'} \neq θ$ .Prove that $T$ is a sufficient statistic if and only if
$\frac{f (x, θ^{'})}{f (x, θ)}$
is a function dependent only on $T (x)$ .

Adison Rogers 2022-11-06

Suppose $X_{n}$ is an i.i.d. random sample from the $N (μ, σ^{2})$ population, where μ is unknown but $σ^{2}$ is known. Consider a test statistic $T = \sqrt{n} (^{'} b a r X_{n} - μ_{0}) / σ$ at the significance level $α$ for $H_{0} : μ = μ_{0} v e r s u s H_{a} : μ \neq μ_{0}$ . Find Type I error of this test.
Confusion:
One of my friends gives this answer:
The hypothesis model will be
$H_{o} : μ = μ_{o}$
$H_{A} : μ \neq μ_{o}$
Type I error $α$ is the probablity of rejecting the null hypothesis $H_{o}$
Thus $p = α$
I think it's not correct, because Type I error is the the probablity of rejecting the null hypothesis $H_{o}$ when $H_{o}$ is true , and α means the size of test is no larger than $α$ , so I think there are some differences between these two concepts.
So who is wrong? why? and what the correct answer should be?

Aleah Avery 2022-11-06

Standard deviation of the mean of sample data
I can't quite understand what this formula means:
$σ_{\bar{x}} = \frac{σ}{\sqrt{n}}$
I know what standard deviation $σ$ is - it's the average distance of my data points (samples) from the mean. But this part is confusing:
For example, suppose the random variable X records a randomly selected student's score on a national test, where the population distribution for the score is normal with mean 70 and standard deviation 5 (N(70,5)). Given a simple random sample (SRS) of 200 students, the distribution of the sample mean score has mean 70 and standard deviation
$\frac{5}{\sqrt{200}} \approx \frac{5}{14.14} \approx 0.35$
Source
I thought the standard deviation $σ = 5$ means that if I take the scores of all students and calculate the mean, then the average distance of a score from that mean will be equal to 5. The set of all scores is called the 'population', right? But here it says the more students' scores I take, the lower the standard deviation - thus the closer the number of samples gets to the size of population, the lower the standard deviation (and its get further from 5).

Siemensueqw 2022-11-05

Determine the covariance and correlation for $X_{1}$ and $X_{2}$ in the joint distribution of the multinomial random variables $X_{1}, X_{2}$ and $X_{3}$ in with $p_{1} = p_{2} = p_{3} = \frac{1}{3}$ and n = 3. What can you conclude about the sign of the correlation between two random variables in a multinomial distribution?

Alberto Calhoun 2022-11-05

Suppose X and Y are two random variables such that $ρ (X, Y) = 1 / 2$ , Var X = 1, and Var Y = 2. Compute Var (X - 2Y).

Siemensueqw 2022-11-05

How do you find the exact values of $\sin θ$ and $\tan θ$ when $\cos θ = \frac{1}{2}$ ?

tramolatzqvg 2022-11-05

Let $X_{1}, X_{2}$ Poisson random variables with mean $μ$ . Is $T = X_{1} - X_{2}$ a sufficient statistic?

pin1ta4r3k7b 2022-11-05

Let $X_{1}, \dots, X_{n}$ be iid from a uniform distribution $U [θ - \frac{1}{2}, θ + \frac{1}{2}]$ with $θ \in R$ unknown. Show that the statistic $T (X) = (X_{(1)}, X_{(n)})$ is minimal sufficient but not complete.

charmbraqdy 2022-11-05

Getting confused over a T-test
its been a while since I've done this and I am getting rather confused.
Let's say I have two data sets of size $n_{1}$ and $n_{2}$
$X = X_{1}, X_{2}, . . X_{n_{1}}$
$Y = Y_{1}, Y_{2}, . . Y_{n_{2}}$
and want to construct a t-test. I have seen this formula in lots of books, what is the intuition and is this correct for finding a t-test?
$T = \frac{\bar{X} - \bar{Y}}{\sqrt{\frac{s_{1}^{2}}{n_{1}} + \frac{s_{2}^{2}}{n_{2}}}}$
I am getting confused a little, I see some books telling me to be careful is my variances are equal. Can i not just put them into this formula either way?
Also are we always referring to sample population, mean and standard deviation. How does the formula change if we have a the population data?
How does this formula change if the data is paired?

tramolatzqvg 2022-11-05

Give an example of stratified random sample and and a proportionate stratified random sample.

figoveck38 2022-11-05

A sample of households in a community is selected from the telephone directory. In this community, 4% of households have no telephone, 10% have only cell phones, and another 25% have unlisted phone numbers.
The telephone directory in the above setting is considered a...
a. contracted source
b. sampling frame
c. parameter level
d. descriptor

ritualizi6zk 2022-11-05

How to refine population statistic when more data is available
Suppose I have two pieces of data about two populations. The first piece of data is the national accident rate, denoted A The second piece of data is the national safety rate, a related, but not exactly inverse piece of data, denoted S
Now if I were to be given an additional piece of data, say a particular cities safety rating, and asked, what is the best guess of that cities accident rating, how would I approach this problem?
Also, I am not sure what this kind of situation/problem is called, if someone could point out the branch of statistics this falls under, that would also be helpful.

Barrett Osborn 2022-11-05

How to show mathematically max of uniform is a sufficient statistic?

MISA6zh 2022-11-05

If the baseline risk of a certain disease for nonsmokers is 1% and the relative risk of the disease is 5 for smokers compared to nonsmokers, what is the risk of the disease for smokers?

atgnybo4fq 2022-11-04

Determining sample size of a set of boolean data where the probability is not 50%
I'll lay out the problem as a simplified puzzle of what I am attempting to calculate. I imagine some of this may seem fairly straightforward to many but I'm starting to get a bit lost in my head while trying to think through the problem.
Let's say I roll a 1000-sided die until it lands on the number 1. Let's say it took me 700 rolls to get there. I want to prove that the first 699 rolls were not number 1 and obviously the only way to deterministically do this is to include the first 699 failures as part of the result to show they were in fact "not 1".
However, that's a lot of data I would need to prove this. I would have to include all 700 rolls, which is a lot. Therefore, I want to probabilistically demonstrate the fact that I rolled 699 "not 1s" prior to rolling a 1. To do this, I decide I will randomly sample my "not 1" rolls to reduce the set to a statistically significant, yet more wieldy number. It will be good enough to demonstrate that I very probably did not roll a 1 prior to roll 700.
Here are my current assumptions about the state of this problem:
- My initial experiment of rolling until success is one of geometric distribution.
- However my goal for this problem is to demonstrate to a third party that I am not lying, therefore the skeptical third party is not concerned with geometric distribution but would view this simply as a binomial distribution problem.
A lot of sample size calculators exist on the web. They are all based around binomial distribution from what I can tell. So here's the formula I am considering:
$n = \frac{N \times X}{X + N - 1}$
$X = \frac{{Z_{α / 2}}^{2} \times p \times (1 - p)}{{M O E}^{2}}$
n is sample size
N is population size
Z is critical value ( $α$ is $1 - c o n f i d e n c e l e v e l a s p r o b a b i l i t y$ )
p is sample proportion
MOE is margin of error
As an aside, the website where I got this formula says it implements "finite population correction", is this desirable for my requirements?
Here is the math executed on my above numbers. I will use $Z_{a / 2} = 2.58$ for $α = 0.01$ , $p = 0.001$ and $M O E = 0.005$ . As stated above, $N = 699$ on account of there being 699 failure cases that I would like to sample with a certain level of confidence.
Based on my understanding, what this math will do is recommend a sample size that will show, with 99% confidence, that the sample result is within 0.5 percentage points of reality.
Doing the math, $X = 265.989744$ and $n = 192.8722086653 \approx 193$ , implying that I can have a sample size of 193 to fulfill this confidence level and interval.
My main question is whether my assumption about $p = \frac{1}{1000}$ is valid. If it's not, and I use the conservative $p = 0.5$ , then my sample size shoots up to $\approx 692$ . So I would like to know if my assumptions about what sample proportion actually is are correct.
More broadly, am I on the right track at all with this? From my attempt at demonstrating this probabilistically to my current thought process, is any of this accurate at all?

Abdiel Mays 2022-11-04

In the example experiment on eye-witness memory of a car accident, if the question they used ("Was there shattered glass?") had been too easy and all participants got it right, this would have compromised the:
A. internal validity of the dependent variable
B. construct validity of the independent variable
C. internal validity of the independent variable
D. construct validity of the dependent variable

Madison Costa 2022-11-04

Why is the statistic $t = r \sqrt{\frac{n - 2}{1 - r^{2}}}$ $\approx t (n - 2)$ ?

charmbraqdy 2022-11-04

What is the difference between $\frac{1}{N} \sum \frac{y_{i}}{x_{i}}$ and $\sum \frac{\bar{y}}{\bar{x}}$
I have a data set and I was looking at the population ratio and trying to estimate it using different methods. I was expecting eq1 and eq2 to be different but was very surprised that it was by a factor of almost 3x. I was just wondering why it was so different, is it partly due to n cancelling on eq2. Any explanation would be appreciated.

Simone Watts 2022-11-04

Finding the $(X - \bar{X})^{2}$ of first 5 data of dataset given mean and population variance
Mean and population variance of the dataset $x_{1}, x_{2} . . x_{10}$ are 19 and 49 respectively. If the value $\sum_{i = 6}^{10} x_{i}^{2} = 1900$ , what is the value of $\sum_{i = 1}^{5} x_{i}^{2} = ?$ .
I've solved it as following and it is wrong:
Population variance:
$S^{2} = \frac{\sum (x_{i} - \bar{x})^{2}}{n} 49 = \frac{\sum_{i = 1}^{5} x_{i}^{2}}{10} + \frac{1900}{10} 49 = \frac{\sum_{i = 1}^{5} x_{i}^{2}}{10} + 190 - 190 + 49 = \frac{\sum_{i = 1}^{5} x_{i}^{2}}{10} \sum_{i = 1}^{5} x_{i}^{2} = - 141 * 10 = - 1410$
And this solution is wrong. How to solve this problem?

Improve Your Understanding of College Level Statistics Problems