stupidstats
Terms
- Gold standard for studies
- randomized, controlled, double blind
- three different types of experiments
- controlled, randomized, double blind
- confounder
- a factor that differs between treatment and control groups
- how do randomized trials minimize the effect of confounders?
- they avoid bias in assigning subjects to the control and treatment groups
- historical controls
- used when giving a placebo would be unethical; NOT a controlled experiment because the researchers did not assign the groups
- CE vs. OS (controlled experiment vs. observational study)
- CEs are better for drawing conclusions about cause and effect
- confounders are associated with what? (2 things)
- group membership and response
- how do you control for confounders?
- compare smaller subgroups that are more homogeneous, or use a weighted average
- Simpson's paradox
- when a confounder has such a big effect that it reverses the direction of the outcome once the subgroups are combined
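A minimal sketch of the reversal described on the Simpson's paradox card. The counts are illustrative numbers chosen so that the reversal occurs, not data from any study in this deck, and the names `counts`, `rate`, and `pooled_rate` are hypothetical. The confounder here is severity: most treatment subjects are severe cases, most control subjects are mild ones.

```python
# Illustrative (successes, total) counts per subgroup; not real study data.
counts = {
    "treatment": {"mild": (81, 87), "severe": (192, 263)},
    "control": {"mild": (234, 270), "severe": (55, 80)},
}

def rate(successes, total):
    """Success rate within one subgroup."""
    return successes / total

def pooled_rate(groups):
    """Success rate after lumping all subgroups together."""
    successes = sum(s for s, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return successes / total

# Within each homogeneous subgroup, treatment does better...
for subgroup in ("mild", "severe"):
    t = rate(*counts["treatment"][subgroup])
    c = rate(*counts["control"][subgroup])
    print(f"{subgroup}: treatment {t:.2f} vs control {c:.2f}")

# ...but the pooled comparison points the other way.
print(f"pooled: treatment {pooled_rate(counts['treatment']):.2f} "
      f"vs control {pooled_rate(counts['control']):.2f}")
```

Comparing within the mild and severe subgroups separately is the "smaller, more homogeneous subgroups" fix from the earlier card.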
- three measures of central tendency
- mean, median, mode
- what are three measures of dispersion?
- SD, Range, IQR
- what impact do outliers have on measures of central tendency?
- big impact on the mean but small on the median and mode; among measures of dispersion, not big on the IQR
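A small sketch of the outlier card above, assuming a made-up five-value sample. Swapping one value for an extreme one moves the mean a lot while the median and mode stay put.

```python
from statistics import mean, median, mode

# Hypothetical sample, then the same sample with one extreme value swapped in.
clean = [2, 3, 3, 4, 5]
with_outlier = [2, 3, 3, 4, 50]

for name, data in (("clean", clean), ("with outlier", with_outlier)):
    print(name, "mean:", mean(data), "median:", median(data), "mode:", mode(data))
```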
- range
- largest observation minus the smallest
- IQR
- 75th percentile of the data minus the 25th percentile of the data
- what are the axes of a histogram?
- X: sample value; Y: percent per unit of X
- distribution
- a histogram whose intervals are very small and whose sample size goes to infinity
- what are the mean and SD of the standard normal curve?
- 0 and 1
- how can you examine whether a random sample is normally distributed?
- normal probability plot
- discrete vs. continuous variable
- discrete takes only separate, countable values (number of cookies in a jar), whereas continuous takes any value in a range (e.g. 1.5 or 1.7)
- continuity correction
- when a discrete variable is approximated by a continuous curve, subtracting 0.5 and adding 0.5 can give a more accurate area. If you are looking for the area of the bar above 32, use 31.5 to 32.5 (everything in between those two values). Only use it when the variable is discrete and the SD is small
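A sketch of the continuity correction in action, assuming a hypothetical example that is not from the deck: the number of heads in 25 fair coin tosses (a discrete count with SD 2.5), approximated by a normal curve. It compares the exact chance of 15 or more heads with the normal area taken from 15 and from 14.5.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 25, 0.5  # hypothetical: number of heads in 25 fair coin tosses
curve = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Exact chance of 15 or more heads, from the binomial formula.
exact = sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(15, n + 1))

# Normal approximation without and with the continuity correction.
uncorrected = 1 - curve.cdf(15)
corrected = 1 - curve.cdf(14.5)  # the bar over 15 starts at 14.5

print(f"exact {exact:.4f}, uncorrected {uncorrected:.4f}, corrected {corrected:.4f}")
```

With a small SD the half-unit shift is a noticeable fraction of the curve's width, which is why the correction matters most in that case.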
- box plot
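- summarizes a data set by plotting 5 numbers (minimum, 25th percentile, median, 75th percentile, maximum)
- outlier formula
- more than 1.5*IQR above the 75th percentile or below the 25th percentile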
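The range, IQR, box plot, and outlier cards can be tied together in a short sketch. The sample is made up, the helper names `five_number_summary` and `find_outliers` are hypothetical, and the percentile convention (`method="inclusive"`) is an assumption; other conventions give slightly different quartiles.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (min, Q1, median, Q3, max) for a list of numbers."""
    q1, q2, q3 = quantiles(data, n=4, method="inclusive")
    return min(data), q1, q2, q3, max(data)

def find_outliers(data):
    """Return values more than 1.5*IQR beyond the quartiles."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < low_fence or x > high_fence]

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]  # hypothetical data
lo, q1, med, q3, hi = five_number_summary(sample)
print("range:", hi - lo, "IQR:", q3 - q1)
print("outliers:", find_outliers(sample))
```

The single extreme value inflates the range but barely touches the IQR, matching the earlier card on outliers.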
- correlation
- measure of the strength of linear association
- correlation (r)
- how much information one variable has about the other. If it is 1 or -1, the linear association is perfect
- R-squared
- proportion of the variation in y that is explained by x
- why don't we need to minimize perpendicular distance to the regression line?
- because the regression equation predicts y from x, so all error occurs in the vertical direction
- residuals
- vertical distance between the data points and the line
- RMSE
- root mean square error: describes the variability of the points around the line
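A minimal sketch linking the correlation, R-squared, residual, and RMSE cards, using made-up paired data. One assumption to flag: the RMSE here divides by n; some texts divide by n - 2 instead.

```python
from math import sqrt

# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

mean_x, mean_y = sum(x) / n, sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / sqrt(sxx * syy)            # correlation
slope = sxy / sxx                    # least-squares slope
intercept = mean_y - slope * mean_x  # least-squares intercept

# Residuals are vertical distances: observed y minus predicted y.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
rmse = sqrt(sum(e ** 2 for e in residuals) / n)  # assumption: divide by n

print(f"r = {r:.3f}, R-squared = {r * r:.3f}, RMSE = {rmse:.3f}")
```

Under the divide-by-n convention, the RMSE also equals sqrt(1 - r^2) times the SD of y, so a stronger correlation means a tighter scatter around the line.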
- when can you make an inference about an individual using the normal distribution?
- histogram of residuals looks normal, and the residual plot shows no trend
- mutually exclusive
- it is impossible for both to happen at the same time, and thus P(A or B) = P(A) + P(B)
- Benford's law
- the leading digit is 1 about 30% of the time (roughly 1/3)
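The figure on the Benford's law card comes from the formula P(d) = log10(1 + 1/d) for leading digit d. The formula is not stated in the deck; the sketch below tabulates it for all nine digits.

```python
from math import log10

# Benford's law: chance that the leading digit is d, for d = 1..9.
benford = {d: log10(1 + 1 / d) for d in range(1, 10)}

for d, prob in benford.items():
    print(d, f"{prob:.3f}")
```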
- SD of a 0-1 box
- SQRT(p(1-p)), where p is the fraction of 1s in the box
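A quick check of the shortcut, assuming the box holds only 0s and 1s and p is the fraction of 1s. The box contents and the helper name `box_sd` are hypothetical. The shortcut should agree with the SD computed directly from the tickets.

```python
from math import sqrt
from statistics import pstdev

def box_sd(p):
    """Shortcut SD for a box of 0s and 1s where a fraction p are 1s."""
    return sqrt(p * (1 - p))

box = [1, 0, 0, 0]          # hypothetical box: one ticket in four is a 1
p = sum(box) / len(box)

# pstdev treats the box as the whole population (divides by n).
print(box_sd(p), pstdev(box))
```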