Course Note

Essay by Song Ye • February 26, 2017 • Course Note • 1,346 Words (6 Pages) • 1,191 Views

Basic Definitions

Experimental unit – an object (e.g., person, thing, transaction, or event) upon which we collect data
Population – set of units that we are interested in studying
Sample – subset of units of a population
Variable – a characteristic or property of an individual experimental unit
Statistical inference – making an estimate, prediction, or other generalization about a population based on information contained in a sample

Elements of Descriptive Statistics

(1) Data set (population or sample), (2) variable of interest, (3) graphs or numerical measures, (4) conclusions about the data pattern

Elements of Statistical Inference

(1) Population, (2) variable of interest, (3) sample, (4) inference, (5) measure of reliability

Types of data

1) Quantitative data – can be described numerically

Age, height, weight, size of family, income, GDP, CPI, Stock market average, monthly sales revenue
Quantitative Data: Cross-sectional / Time Series

2) Qualitative data – not inherently numerical

Also called categorical data or attributes data
Color of eyes, accurate / not, left or right handed, yes/no variables, employment status, defect/no defect, occupation code…

Collecting Data & Sampling Techniques

Published source
Designed experiment
Observational study (Survey)
Random sample selected from the target population of interest
Selection bias – a subset of the experimental units in the population are excluded from the sample
Nonresponse bias – researchers are unable to obtain data on all experimental units selected for the sample. (Solution – sample the non-respondents to determine their characteristics.)

Sampling Designs

1) Stratified random sample

Split up the population into strata
Randomly sample from each stratum
More representative sample?
Dependent on the researcher’s strata
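The two steps of a stratified random sample (split into strata, then sample randomly within each stratum) can be sketched as follows. This is only an illustration: the employee population, the department labels, and the function name stratified_sample are all hypothetical, and taking an equal number of units from every stratum is an assumption, not something the notes prescribe.

```python
import random

def stratified_sample(units, stratum_of, per_stratum, seed=0):
    """Draw a simple random sample of size per_stratum from every stratum."""
    rng = random.Random(seed)
    # Step 1: split the population into strata.
    strata = {}
    for unit in units:
        strata.setdefault(stratum_of(unit), []).append(unit)
    # Step 2: sample randomly (without replacement) inside each stratum.
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], per_stratum))
    return sample

# Hypothetical population: 60 employees tagged with a department.
population = [(i, "sales" if i < 40 else "engineering") for i in range(60)]
sample = stratified_sample(population, lambda u: u[1], per_stratum=5)
```

Every stratum is guaranteed to appear in the sample, which is why the result can be more representative than a simple random sample, provided the strata were chosen sensibly.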

2) Systematic random sampling

Take every kth item
Useful for processes
Watch out for systematic sampling biases (cycles)
Security lines at airports - TSA
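Taking every kth item is short enough to show directly. The sketch below assumes a random starting point within the first k items, which is a common convention but is not stated in the notes; the numbered passenger line is made up.

```python
import random

def systematic_sample(items, k, seed=0):
    """Pick a random start in the first k items, then take every k-th item."""
    start = random.Random(seed).randrange(k)
    return items[start::k]

line = list(range(1, 101))  # hypothetical: passengers numbered 1..100
picked = systematic_sample(line, k=10)
```

If the underlying process repeats every k items, every pick lands on the same phase of the cycle, which is the systematic sampling bias the list above warns about.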

3) Cluster and multi-stage clustering

Cluster the population into subpopulations
Randomly select clusters to get to the elements of the population
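A single-stage cluster sample, under the assumption that every element of a selected cluster is kept, might look like this. The store and customer names are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=0):
    """Randomly choose whole clusters and return every element inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

# Hypothetical population: customers grouped by store.
stores = {
    "store_a": ["a1", "a2", "a3"],
    "store_b": ["b1", "b2"],
    "store_c": ["c1", "c2", "c3", "c4"],
    "store_d": ["d1"],
}
chosen, units = cluster_sample(stores, n_clusters=2)
```

In a multi-stage design, the elements inside each chosen cluster would be sampled again rather than taken in full.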

4) Convenience samples – sample elements are selected that are convenient to the researchers

Types of Sample Design Errors

Sampling error – difference between the estimator (sample statistic) and the true population parameter.

Due to sample vs. population

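This is the error caused by the sampling method itself. When a sample is drawn at random from the population, which sample gets drawn is a matter of chance; the deviation between the sample statistic x obtained from that sample and the population parameter μ is called the actual sampling error. When the population is fairly large, the number of possible samples is very large and the actual sampling errors cannot all be listed, so the average sampling error is used to describe the average level of the actual sampling errors across samples.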
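A small simulation can make the idea of actual versus average sampling error concrete. Everything here is an assumption made for illustration: the population is synthetic (normal, mean 50, standard deviation 10), the sample size is 100, and the "average sampling error" is approximated by the mean absolute error over 500 repeated samples.

```python
import random
import statistics

rng = random.Random(1)
# Hypothetical population of 10,000 values.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)  # true population parameter

def sampling_error(n):
    """Actual sampling error of one sample mean: estimator minus parameter."""
    sample = rng.sample(population, n)
    return statistics.fmean(sample) - mu

# Repeat the draw many times to approximate the average size of the error.
errors = [sampling_error(100) for _ in range(500)]
mean_abs_error = statistics.fmean(abs(e) for e in errors)
```

Each entry of errors is one actual sampling error; mean_abs_error summarizes their typical size without listing every possible sample.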
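Nonsampling (measurement) error – all other errors that cause a difference between an estimator and a population parameter.
Nonsampling error is the sum of all errors other than sampling error. It can arise at every stage of a market survey, and a mistake at any stage can increase nonsampling error and distort the data. When we speak of controlling error, we mainly mean controlling nonsampling error.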

Poor sampling design
Interviewer errors / interviewer biases
False information provided by respondent
Poorly worded or loaded questions
Data errors
Undercoverage
Non respondents

Descriptive Statistics

Four things to know about any distribution

1) Measures of Location (Central Tendency)

Midrange

$$\text{Midrange} = \frac{x_{\min} + x_{\max}}{2}$$

Mode
Median (the middle value: 50% of the data are larger than the median and 50% are smaller)
Mean

$$\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$$

Trimmed mean: mean of the middle x% of the data
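The measures of location listed above can be compared on one small data set. The data are hypothetical and include one extreme value on purpose. The helper names midrange and trimmed_mean are illustrative, and the way the trimmed mean rounds the number of values cut from each end is an assumption.

```python
import statistics

def midrange(data):
    """Halfway point between the smallest and largest value."""
    return (min(data) + max(data)) / 2

def trimmed_mean(data, middle_pct):
    """Mean of the middle middle_pct percent of the sorted data."""
    ordered = sorted(data)
    # Number of values dropped from each end (rounding is an assumption).
    cut = round(len(ordered) * (1 - middle_pct / 100) / 2)
    kept = ordered[cut:len(ordered) - cut]
    return statistics.fmean(kept)

data = [2, 3, 3, 4, 5, 6, 7, 8, 9, 53]  # hypothetical, one extreme value

summary = {
    "midrange": midrange(data),                   # 27.5
    "mode": statistics.mode(data),                # 3
    "median": statistics.median(data),            # 5.5
    "mean": statistics.fmean(data),               # 10.0
    "trimmed mean (80%)": trimmed_mean(data, 80), # 5.625
}
```

The midrange and the mean are pulled toward the extreme value 53, while the median and the trimmed mean stay near the bulk of the data.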

2) Measures of Variability (Dispersion)

Range = max - min
Interquartile range (IQR) = QU - QL
Upper (3rd) quartile - lower (1st) quartile
75th percentile - 25th percentile
‘Upper management’ - range of the middle 50% of the distribution
Position of QU in the ordered data = 3/4 (n + 1), rounded to the nearest integer
Position of QL in the ordered data = 1/4 (n + 1), rounded to the nearest integer
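The position rule for the quartiles can be applied mechanically, as in this sketch. The data set is hypothetical, with n = 11 chosen so that both positions are whole numbers. How a position ending in .5 should be rounded is not specified in the notes; Python's round is used here only as a placeholder, and other software may interpolate between values instead.

```python
def quartile_positions(n):
    """1-based positions of QL and QU in the sorted data, per the (n + 1) rule."""
    lower = round((n + 1) / 4)
    upper = round(3 * (n + 1) / 4)
    return lower, upper

def iqr(data):
    """Return (QL, QU, IQR) using the position rule."""
    ordered = sorted(data)
    lower, upper = quartile_positions(len(ordered))
    q_l, q_u = ordered[lower - 1], ordered[upper - 1]
    return q_l, q_u, q_u - q_l

data = [7, 1, 3, 12, 5, 9, 15, 4, 8, 10, 6]  # hypothetical, n = 11
result = iqr(data)  # positions 3 and 9 of the sorted data -> (4, 10, 6)
```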

Box plot
IQR is the box, median is the line in the box
Hinge points are at the edges of the box: QU and QL
Inner fences: QL - 1.5 (IQR) and QU + 1.5 (IQR)
Values between the inner and outer fences are suspect outliers
Outer fences: QL - 3 (IQR) and QU + 3 (IQR)
Values beyond the outer fences are designated as highly suspect outliers
Whiskers – lines extending from the box to the edges of the inner fences
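The fence rules translate into a simple classification of individual values. In this sketch the hinge values QL = 4 and QU = 10 are hypothetical, and the function name classify and its labels are illustrative.

```python
def classify(value, q_l, q_u):
    """Label a value using the inner (1.5 IQR) and outer (3 IQR) fences."""
    iqr = q_u - q_l
    inner = (q_l - 1.5 * iqr, q_u + 1.5 * iqr)
    outer = (q_l - 3 * iqr, q_u + 3 * iqr)
    if value < outer[0] or value > outer[1]:
        return "highly suspect outlier"
    if value < inner[0] or value > inner[1]:
        return "suspect outlier"
    return "inside inner fences"

# Hypothetical hinges: QL = 4, QU = 10, so IQR = 6,
# inner fences at -5 and 19, outer fences at -14 and 28.
labels = {v: classify(v, 4, 10) for v in (8, 20, 30)}
```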

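Variance: population / sample
Population variance

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

Sample variance

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

Computational equation for sample variance

$$s^2 = \frac{\sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}}{n - 1}$$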

Variance tells how far, on average, each value is from the mean (in squared units)
Standard deviation: the square root of the variance
Population: σ Sample: s

$$\sigma = \sqrt{\sigma^2} \qquad s = \sqrt{s^2}$$
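As a check on the definitions above, the sketch below computes the sample variance with the computational (shortcut) form and compares it with Python's statistics module, which divides by n - 1 for the sample versions and by N for the population versions. The data set and the function name sample_variance_shortcut are hypothetical.

```python
import math
import statistics

def sample_variance_shortcut(data):
    """Computational form: (sum of squares - (sum)^2 / n) / (n - 1)."""
    n = len(data)
    total = sum(data)
    total_sq = sum(x * x for x in data)
    return (total_sq - total * total / n) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical; mean is 5

pop_var = statistics.pvariance(data)       # divides by N     -> 4
pop_sd = statistics.pstdev(data)           # sqrt of pop_var  -> 2.0
samp_var = statistics.variance(data)       # divides by n - 1 -> 32 / 7
samp_sd = statistics.stdev(data)           # sqrt of samp_var
shortcut = sample_variance_shortcut(data)  # agrees with samp_var
```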

Coefficient of variation (used to compare the degree of dispersion of two data sets)

$$CV = \frac{s}{\bar{x}} \times 100\%$$
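Because the coefficient of variation is unit-free, it allows a comparison between data sets measured on different scales. The sketch assumes the sample form (sample standard deviation divided by sample mean, expressed as a percentage). Both data sets are made up.

```python
import statistics

def coefficient_of_variation(data):
    """Sample standard deviation as a percentage of the sample mean."""
    return statistics.stdev(data) / statistics.fmean(data) * 100

# Hypothetical measurements on two different scales.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_height = coefficient_of_variation(heights_cm)  # about 4.65 (%)
cv_weight = coefficient_of_variation(weights_kg)  # about 22.59 (%)
```

Here the weights are relatively more dispersed than the heights, even though the two standard deviations are in different units and cannot be compared directly.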

3) Shape

Symmetry / skewness / mathematical form
Mound-shaped Distributions-Use the Empirical Rule
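z-score (standard score):

$$z = \frac{x - \bar{x}}{s} \quad \text{(sample)} \qquad z = \frac{x - \mu}{\sigma} \quad \text{(population)}$$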

[pic 11]

Approximately 68% of the data is within 1 standard deviation of the mean
Approximately 95% of the data is within 2 standard deviations of the mean
Approximately 99.7% (essentially all) of the data is within 3 standard deviations of the mean
|z| > 2: possible outlier; |z| > 3: outlier.
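The outlier rule of thumb can be applied by computing a z-score for every observation. This sketch uses the sample mean and sample standard deviation and treats the thresholds as applying to the absolute value of z, which is an interpretation of the rule above. The data and the function names z_scores and flag are hypothetical.

```python
import statistics

def z_scores(data):
    """Sample z-score of every value: (x - mean) / s."""
    mean = statistics.fmean(data)
    s = statistics.stdev(data)
    return [(x - mean) / s for x in data]

def flag(z):
    """Apply the rule of thumb to one z-score."""
    if abs(z) > 3:
        return "outlier"
    if abs(z) > 2:
        return "possible outlier"
    return "typical"

data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 12, 30]  # hypothetical
flags = [flag(z) for z in z_scores(data)]
```

Only the value 30 is flagged; its z-score is a little above 3, while every other value lies well within one standard deviation of the mean.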
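For a distribution of any shape, use Chebyshev's Inequality, where k is the number of standard deviations from the mean.

$$\text{Proportion of data within } k \text{ standard deviations of the mean} \ge 1 - \frac{1}{k^2}, \quad k > 1$$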
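Chebyshev's bound is easy to tabulate and to compare with the Empirical Rule. The function name chebyshev_bound is illustrative; the bound is the standard 1 - 1/k^2 and only says something useful for k > 1.

```python
def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

bounds = {k: chebyshev_bound(k) for k in (2, 3)}
# k = 2 -> at least 0.75 of the data; k = 3 -> at least 8/9 (about 0.889)
```

These guarantees (at least 75% and about 89%) are weaker than the roughly 95% and 99.7% of the Empirical Rule because they must hold for every distribution, not only mound-shaped ones.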

4) Data patterns (for time series data)

Time series

Graphical Techniques

Bar chart

Vertical axis: frequency, relative frequency
Horizontal axis: variable of interest (Xi)

Pareto diagram - Bar chart with bars ordered by frequency (highest to lowest)
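The ordering behind a Pareto diagram is a frequency count sorted from highest to lowest. The sketch below prepares the bar heights (frequency and relative frequency) without drawing anything. The complaint categories are made up.

```python
from collections import Counter

# Hypothetical complaint categories recorded for 10 customers.
complaints = ["late", "damaged", "late", "wrong item", "late",
              "damaged", "late", "billing", "damaged", "late"]

counts = Counter(complaints)
n = len(complaints)
# most_common() returns categories ordered from highest to lowest frequency.
pareto = [(category, freq, freq / n) for category, freq in counts.most_common()]
```

Each tuple holds the category for the horizontal axis and the frequency and relative frequency for the vertical axis, already in Pareto order.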

Random Variables and Probability Distributions

Types of Random Variables-Discrete / Continuous

Discrete – random variable that can take on only a countable number of values

Number of defects per product, occupation code, type of failure, reason for customer return, type of customer complaint

Continuous – random variable that can take on any of the infinitely many values within an interval (measurement)

Wait time at a fast-food window, strength of a laptop case, response time of a computer system

...
