stupidstats

Terms

Gold standard for studies
randomized, controlled, double-blind
three different types of experiments
controlled, randomized, double-blind
confounder
a factor, other than the treatment, that differs between the treatment and control groups and also affects the response
how do randomized trials minimize the effect of confounders?
random assignment avoids bias in who is placed in the control group versus the treatment group
historical controls
used when giving a placebo would be unethical. NOT a controlled experiment, because the researchers did not assign the groups
CE vs. OS (controlled experiment vs. observational study)
CEs are better for drawing conclusions
confounders are associated with what? (2 things)
group membership and response
how do you control for confounders?
compare smaller subgroups that are more homogeneous, or use a weighted average
Simpson's paradox
when a confounder has such a big effect that the association seen in every subgroup reverses when the subgroups are combined
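A minimal sketch of the reversal, using hypothetical counts made up for illustration (the treatment names "A" and "B" and the severity subgroups are assumptions, not from the deck). Severity acts as the confounder: treatment A is given mostly to severe cases.

```python
from fractions import Fraction

# Hypothetical (successes, patients) per treatment and severity subgroup.
counts = {
    "A": {"mild": (81, 87), "severe": (192, 263)},
    "B": {"mild": (234, 270), "severe": (55, 80)},
}

def rate(successes, total):
    """Success rate as an exact fraction."""
    return Fraction(successes, total)

def pooled_rate(groups):
    """Success rate after combining all subgroups of one treatment."""
    successes = sum(s for s, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return Fraction(successes, total)

# Within each subgroup, compare A against B.
for severity in ("mild", "severe"):
    a = rate(*counts["A"][severity])
    b = rate(*counts["B"][severity])
    print(f"{severity}: A={float(a):.3f}  B={float(b):.3f}")

# After pooling, the comparison flips.
pooled_a = pooled_rate(counts["A"])
pooled_b = pooled_rate(counts["B"])
print(f"pooled: A={float(pooled_a):.3f}  B={float(pooled_b):.3f}")
```

With these counts A does better in both subgroups, yet B looks better once the subgroups are pooled, which is why comparing homogeneous subgroups is the safer approach.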
three measures of central tendency
mean, median, mode
what are three measures of dispersion?
SD, Range, IQR
what impact do outliers have on measures of central tendency?
big impact on the mean but small on the median and mode. the IQR (a measure of dispersion) is also barely affected
range
largest observation minus the smallest
IQR
75th percentile of the data minus the 25th percentile of the data
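A small sketch of both measures on a hypothetical sample. Percentiles can be computed under several conventions; this one assumes Python's `statistics.quantiles` with `method="inclusive"`, so other textbooks or software may give slightly different quartiles.

```python
import statistics

def spread(values):
    """Return (range, IQR) for a list of numbers."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return max(values) - min(values), q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]           # hypothetical sample
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # same sample, largest value replaced

data_range, data_iqr = spread(data)
outlier_range, outlier_iqr = spread(with_outlier)

print("range:", data_range, "->", outlier_range)
print("IQR:  ", data_iqr, "->", outlier_iqr)
```

The single extreme value changes the range a great deal, while the IQR stays the same under this convention.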
what are the axes of a histogram?
X: sample value. Y: percent per unit of X
distribution
the limit of a histogram as its intervals become very small and its sample size goes to infinity
what are the mean and SD of the standard normal curve?
0 and 1
how can you examine whether a random sample is normally distributed?
normal probability plot
discrete vs. continuous variable
discrete takes only certain separate values (number of cookies in a jar), whereas continuous can take any value in a range (1.5, 1.7)
continuity correction
when a discrete variable is approximated by a continuous curve, subtracting or adding .5 can give a more accurate area. each value is treated as an interval: 32 becomes 31.5 to 32.5 (everything in between those two values), so the area for 32 and above starts at 31.5. only use it when the variable is discrete and the SD is small
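A sketch of the correction at work, assuming a hypothetical setup of 100 fair coin tosses and the chance of 55 or more heads. It compares the exact binomial answer with the normal-curve area with and without the .5 adjustment.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # hypothetical: 100 tosses of a fair coin
k = 55           # want the chance of k or more heads

# Exact chance from the binomial formula.
exact = sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Normal curve with the same mean and SD as the number of heads.
curve = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

without_correction = 1 - curve.cdf(k)      # area above 55
with_correction = 1 - curve.cdf(k - 0.5)   # area above 54.5

print(f"exact:              {exact:.4f}")
print(f"without correction: {without_correction:.4f}")
print(f"with correction:    {with_correction:.4f}")
```

In this setup the corrected area lands much closer to the exact chance than the uncorrected one.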
box plot
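summarizes a data set by plotting 5 numbers: minimum, 25th percentile, median, 75th percentile, maximum
outliers formula
more than 1.5*IQR above the 75th percentile or below the 25th percentile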
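The rule can be sketched as a short function. The helper name `find_outliers` and the sample are hypothetical, and the quartiles assume the `method="inclusive"` convention of `statistics.quantiles`.

```python
import statistics

def find_outliers(values):
    """Return values more than 1.5*IQR beyond the 25th/75th percentiles."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

sample = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # hypothetical data
outliers = find_outliers(sample)
print(outliers)
```

Here the quartiles are 3 and 7, so the fences sit at -3 and 13 and only 90 falls outside them.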
correlation
measure of the strength of linear association
correlation (r)
how much information one variable has about the other. if it's 1 or -1, the linear association is perfect
R-squared
proportion of the variation in y that is explained by x
why don't we need to minimize perpendicular distance to the regression line?
because the regression line predicts y from x, so all the error is measured in the vertical direction
residuals
vertical distance between the data points and the line
RMSE
root mean square error: describes the variability of the points around the line
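A sketch tying r, R-squared, residuals, and RMSE together on a hypothetical five-point data set. It assumes the convention of dividing by n (not n - 2) inside the RMSE, which some courses and software do differently.

```python
from math import sqrt

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / sqrt(sxx * syy)             # correlation
slope = sxy / sxx                     # least-squares slope
intercept = mean_y - slope * mean_x   # least-squares intercept

# Residuals are vertical distances from each point to the line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sse = sum(e**2 for e in residuals)

r_squared = r**2
rmse = sqrt(sse / n)                  # divides by n: an assumed convention

print(f"r = {r:.3f}, R-squared = {r_squared:.3f}, RMSE = {rmse:.3f}")
```

R-squared here equals 1 minus the share of the variation in y left in the residuals, and the RMSE equals sqrt(1 - r^2) times the SD of y.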
when can you make an inference about an individual using the normal distribution?
when the histogram of residuals looks normal and the residual plot shows no trend
mutually exclusive
it is impossible for both to happen at the same time, and thus P(A or B) = P(A) + P(B)
Benford's law
the leading digit is 1 about 30% of the time (roughly 1/3)
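The figure for a leading 1 comes from the usual statement of Benford's law, where digit d leads with chance log10(1 + 1/d). A quick sketch of all nine chances:

```python
from math import log10

# Chance that the leading digit is d, for d = 1..9, under Benford's law.
benford = {d: log10(1 + 1 / d) for d in range(1, 10)}

for digit, chance in benford.items():
    print(f"{digit}: {chance:.3f}")
```

The chances fall steadily from digit 1 to digit 9 and add up to 1.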
SD of a box
SQRT(p(1-p)), for a box of 0s and 1s where p is the fraction of 1s
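A quick check of the shortcut against a direct SD calculation, assuming a hypothetical box with 3 tickets marked 1 and 7 marked 0.

```python
from math import sqrt
from statistics import pstdev

box = [1] * 3 + [0] * 7  # hypothetical box: three 1s and seven 0s

p = sum(box) / len(box)       # fraction of 1s in the box
shortcut = sqrt(p * (1 - p))  # SD from the shortcut formula
direct = pstdev(box)          # SD computed directly (divides by n)

print(f"shortcut: {shortcut:.4f}  direct: {direct:.4f}")
```

Both routes give the same SD, since the shortcut is the ordinary SD formula simplified for a box holding only 0s and 1s.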

Deck Info

32

loklok
