Thursday, November 12, 2015

Give an example of each of the following types of statistical variables.

- nominal variables
- ordinal variables
- interval variables
- ratio variables

# nominal: gender (male vs. female)
# ordinal: ranking (very fast, fast, slow)
# interval: geographic longitude
# ratio: temperature in kelvin (has a true zero at absolute zero)
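As a rough illustration, the four measurement levels might be represented in R as follows. This is only a sketch: the sample values and variable names are made up and are not part of the exercise.

```r
# Hypothetical sample values, made up for illustration only.

# Nominal: categories with no inherent order.
gender <- factor(c("male", "female", "female"))

# Ordinal: categories with a meaningful order but no fixed spacing.
speed <- factor(c("fast", "slow", "very fast"),
                levels = c("slow", "fast", "very fast"),
                ordered = TRUE)

# Interval: differences are meaningful, but the zero point is arbitrary.
longitude <- c(-122.4, 2.35, 139.7)

# Ratio: a true zero exists, so ratios are meaningful.
kelvin <- c(150, 300)

is.ordered(gender)       # FALSE
is.ordered(speed)        # TRUE
speed[1] > speed[2]      # TRUE: "fast" ranks above "slow"
kelvin[2] / kelvin[1]    # 2: 300 K is twice 150 K
```

Comparisons such as `>` are only defined for the ordered factor; applying them to the nominal factor yields `NA` with a warning.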

What is a z-score? What can it be used for?

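# a z-score is the number of standard deviations a value lies from the mean: z = (x - mean) / sd
# converting values to z-scores standardizes the distribution
# z-scores can be used to find percentile ranks and to compare values from different distributions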
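A minimal sketch of both ideas, using made-up exam scores. The percentile step assumes the data are roughly normal, which is an assumption of this example and not something the exercise states.

```r
# Hypothetical exam scores, made up for illustration.
scores <- c(55, 60, 65, 70, 75, 80, 85)

# z-score: distance from the mean in units of standard deviation.
z <- (scores - mean(scores)) / sd(scores)

# scale() computes the same values (it returns a one-column matrix).
z_scaled <- as.vector(scale(scores))
all.equal(z, z_scaled)            # TRUE

# Approximate percentile rank, ASSUMING the scores are roughly normal.
percentile <- pnorm(z) * 100
round(percentile[scores == 70])   # 50: the mean sits at the 50th percentile
```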

In R, what are the function names for z-score, median, minimum and mode?

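# z-score: scale()
# median: median()
# minimum: min()
# mode: there is no built-in function (mode() returns the storage mode); use a frequency table instead:
# sorted_freq <- sort(table(Height), decreasing = TRUE)
# max_freq <- sorted_freq[1]
# names(max_freq)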
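The frequency-table approach could be wrapped in a small helper. The name `stat_mode` is hypothetical and not part of base R; unlike the one-liner above, this sketch returns every value tied for the highest count.

```r
# stat_mode is a hypothetical helper name, not part of base R.
stat_mode <- function(x) {
  freq <- table(x)
  # Keep every value whose count equals the highest count (handles ties).
  names(freq)[as.vector(freq) == max(freq)]
}

stat_mode(c(3, 5, 5, 7, 7, 7))     # "7"
stat_mode(c("a", "b", "a", "b"))   # "a" "b"
```

Because the result comes from `names()`, it is a character vector; wrap it in `as.numeric()` if a number is needed.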

What is the mean of a set of z-scores?

# 0
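This can be checked numerically. The sketch below uses the `Height` column of the built-in `trees` data; the choice of column is arbitrary.

```r
library(datasets)

# Standardize the column; as.vector() drops the matrix wrapper from scale().
z <- as.vector(scale(trees$Height))

# The mean is 0 up to floating-point rounding, and the standard deviation is 1.
all.equal(mean(z), 0)   # TRUE
all.equal(sd(z), 1)     # TRUE
```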

What is a **positive z-score**?

# a positive z-score means the value is above the mean

Why do we need to calculate variability?

# to find out how spread out (dispersed) the data is
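A short sketch of why the mean alone is not enough: the two made-up samples below share the same mean but differ greatly in spread.

```r
# Two hypothetical samples with the same mean but different spread.
tight <- c(48, 49, 50, 51, 52)
wide  <- c(10, 30, 50, 70, 90)

mean(tight)          # 50
mean(wide)           # 50

# Variance, standard deviation and range tell the two samples apart.
var(tight)           # 2.5
var(wide)            # 1000
sd(tight) < sd(wide) # TRUE
diff(range(wide))    # 80
```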

- Load the library **datasets**.
- We will use the **trees** data again.

require(datasets)
str(trees)

## 'data.frame': 31 obs. of 3 variables:
##  $ Girth : num 8.3 8.6 8.8 10.5 10.7 10.8 11 11 11.1 11.2 ...
##  $ Height: num 70 65 63 72 81 83 66 75 80 75 ...
##  $ Volume: num 10.3 10.3 10.2 16.4 18.8 19.7 15.6 18.2 22.6 19.9 ...

- [a] What is the maximum value within column *Height*?
- [b] What are the median and variance values within column *Volume*?
- [c] Within the *Girth* column, extract the values that are strictly larger than 12.0. Scale the extracted values.
- [d] Within the *Volume* column, extract the values that are smaller than or equal to 31.7. Plot a histogram of the scaled values and add a vertical *red* line at x = 0.
- [e] Within the *Height* column, which value is the mode?

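attach(trees)
# [a] maximum of Height
max(Height)
# [b] median and variance of Volume
median(Volume)
var(Volume)
# [c] scale the Girth values strictly larger than 12.0
scale(Girth[Girth > 12.0])
# [d] histogram of the scaled Volume values <= 31.7, with a red vertical line at x = 0
v <- scale(Volume[Volume <= 31.7])
hist(v)
abline(v=0, col='red')
# [e] mode of Height: the most frequent value sorts first when the counts are negated
names(sort(-table(Height))[1])
detach(trees)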
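As an aside, `attach()` can silently mask other variables with the same names, so one common alternative is `with()` or plain `$` indexing. The sketch below redoes parts [a] and [c] that way; the variable names `max_height` and `big_girth_z` are made up for this example.

```r
library(datasets)

# [a] with() evaluates the expression inside the data frame, without attach().
max_height <- with(trees, max(Height))
max_height   # 87

# [c] Plain $ indexing works as well.
big_girth_z <- scale(trees$Girth[trees$Girth > 12.0])
```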