Statistics Chapter 2 Notes

=** Section 2.1: **=
center: The three ways to find the center are the mean (the average of the data), the mode (the value with the greatest frequency), and the median (the middle of the ordered set). E.g. for the data 1, 2 the sum is 1+2=3, so the mean is 3/2 = 1.5; in 1, 2, 2, 3, 4 the mode is 2; in 1, 2, 3 the median is 2.

variability: the spread in a variable or a probability distribution, e.g. variance.

shape: the distribution of the data within a data set, e.g. J-shaped.

intervals: an estimation of a population parameter.

classes: the intervals used to group data, e.g. temperature classes.


 * lower class limit: the least number that can belong to a class, e.g. 1, 6, 11, 16, 21
 * upper class limit: the greatest number that can belong to a class, e.g. 5, 10, 15, 20, 25, 30
 * class width: the distance between the lower (or upper) limits of consecutive classes, e.g. 6-1=5

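frequency distribution: a table that shows classes or intervals of data entries together with the frequency of each class.

frequency: the number of times an event occurred in an experiment or study; for a class, the number of data entries in that class.

range: the difference between the maximum and minimum data entries.

Example 1: range = 450-59 = 391. Dividing by 7 classes gives 391/7 ≈ 55.86, which is rounded up to a class width of 56.

sigma (Σ): used to indicate a summation of values.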
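The Example 1 arithmetic can be sketched in a few lines. This is only an illustration: the function name `class_width` is made up here, and starting the first class at the minimum entry is an assumption, not something the notes state.

```python
import math

def class_width(minimum, maximum, num_classes):
    """Return (range, class width), rounding the width up to a whole number."""
    data_range = maximum - minimum
    # Note: math.ceil leaves an exact whole-number quotient unchanged.
    return data_range, math.ceil(data_range / num_classes)

data_range, width = class_width(59, 450, 7)

# Assumption: the first lower class limit is the minimum data entry.
lower_limits = [59 + i * width for i in range(7)]

print(data_range)    # 391
print(width)         # 56
print(lower_limits)  # [59, 115, 171, 227, 283, 339, 395]
```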

midpoint: the sum of the lower and upper limits of a class divided by 2. E.g. for the class 1-5, the midpoint is (1+5)/2 = 3.

relative frequency: another term for proportion. It is calculated by dividing the number of times an event occurs by the total number of times the experiment is carried out; for a class, it is the class frequency divided by the total number of data entries.

cumulative frequency: the sum of the frequency of a class and the frequencies of all previous classes. The cumulative frequency of the last class equals the total number of data entries.

Example 2: finding the midpoint, relative frequency, and cumulative frequency (look at the book).
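As a rough sketch of these three calculations, the snippet below builds the extra columns of a frequency distribution. The frequencies are the ones listed in Example 8 of these notes; the class limits are an assumption, chosen so that they reproduce the midpoints listed there.

```python
from itertools import accumulate

# Assumed class limits (width 12) and the Example 8 frequencies.
lower = [7, 19, 31, 43, 55, 67, 79]
upper = [18, 30, 42, 54, 66, 78, 90]
freq = [6, 10, 13, 8, 5, 6, 2]

n = sum(freq)  # total number of data entries

# Midpoint: (lower limit + upper limit) / 2
midpoints = [(lo + hi) / 2 for lo, hi in zip(lower, upper)]

# Relative frequency: class frequency / total
relative = [f / n for f in freq]

# Cumulative frequency: running total of the frequencies
cumulative = list(accumulate(freq))

for row in zip(lower, upper, freq, midpoints, relative, cumulative):
    print(row)
```

The last cumulative frequency equals n, and the relative frequencies add up to 1, which is a quick way to check the table.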

frequency histogram: A bar graph that represents the frequency distribution of a data set. Properties of a frequency histogram:


 * 1) horizontal scale: It is quantitative and it measures the data values.
 * 2) vertical scale: It measures the frequencies of the classes.
 * 3) Consecutive bars must touch.

Class boundaries: the numbers that separate classes without forming gaps between them. For whole-number data, subtract 0.5 from each lower class limit to get the lower class boundary, and add 0.5 to each upper class limit to get the upper class boundary.

frequency polygon: a line graph for understanding the shape of a distribution, with points plotted at the class midpoints. It serves the same purpose as a histogram.

relative frequency histogram: uses the same classes and has the same shape as the frequency histogram, but the vertical scale measures relative frequencies (each frequency compared to the total).

cumulative frequency graph / ogive: a line graph that displays cumulative frequencies. It plots the cumulative frequency of each class against its upper class boundary. Steps to construct an ogive:


 * 1) Construct a frequency distribution that includes cumulative frequencies as one of the columns.
 * 2) Specify the horizontal and vertical scales. The horizontal scale consists of upper class boundaries and the vertical scale measures cumulative frequencies.
 * 3) Plot points that represent the upper class boundaries and their corresponding cumulative frequencies.
 * 4) Connect the points in order from left to right.
 * 5) The graph should start at the lower boundary of the first class (where the cumulative frequency is zero) and should end at the upper boundary of the last class (where the cumulative frequency equals the sample size).
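The steps above can be sketched by computing the points an ogive would connect. The class limits and frequencies below are illustrative assumptions (whole-number data, so boundaries sit 0.5 beyond the limits); plotting itself is left out.

```python
from itertools import accumulate

# Assumed whole-number class limits and frequencies (illustrative only).
first_lower_limit = 7
upper_limits = [18, 30, 42, 54, 66, 78, 90]
freq = [6, 10, 13, 8, 5, 6, 2]

# Steps 1-2: cumulative frequencies and upper class boundaries.
cumulative = list(accumulate(freq))
upper_boundaries = [u + 0.5 for u in upper_limits]

# Steps 3-5: start at the lower boundary of the first class with a
# cumulative frequency of 0, then one point per upper class boundary.
points = [(first_lower_limit - 0.5, 0)] + list(zip(upper_boundaries, cumulative))

for x, y in points:
    print(x, y)
```

Connecting `points` from left to right with straight segments gives the ogive; the last point's height equals the total number of entries.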

=** Section 2.2: **=
**stem-and-leaf plot:** Stem-and-leaf plots are a method for showing the frequency with which certain classes of values occur. You could make a frequency distribution table or a histogram for the values, or you can use a stem-and-leaf plot and let the numbers themselves show pretty much the same information.


**exploratory data analysis (EDA):** Exploratory data analysis is an approach to analyzing data sets to summarize their main characteristics in an easy-to-understand form, often with visual graphs, without using a statistical model or having formulated a hypothesis.

Stem-and-leaf plot with one row per stem:
|| 6 || 7,8 ||
|| 7 || 3,5,5,6,9 ||
|| 8 || 0,0,2,3,5,7,8 ||
|| 9 || 0,1,1,2,4,5 ||

The same data with two rows per stem:
|| 6 ||  ||
|| 6 || 7,8 ||
|| 7 || 3 ||
|| 7 || 5,5,6,9 ||
|| 8 || 0,0,2,3 ||
|| 8 || 5,7,8 ||
|| 9 || 0,1,1,2,4 ||
|| 9 || 5 ||

Test scores of 17 high school students: 69, 75, 77, 79, 82, 84, 87, 88, 89, 89, 89, 90, 91, 93, 96, 100, and 100. The stem-and-leaf plot takes all but the last digit of each score as the stem and uses the remaining digit as the leaf. As an example, for the score of 69, the 6 is the stem and the 9 is the leaf; for the next three grades (75, 77, and 79), 7 is the stem, and 5, 7, and 9 are the leaves.
**stem:** all but the last digit of a data entry.
**leaf:** the last digit of a data entry.
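A minimal sketch of how the 17 test scores above split into stems and leaves. It assumes whole-number scores, so the stem is the score divided by 10 and the leaf is the remainder; the function name `stem_and_leaf` is made up for this illustration.

```python
from collections import defaultdict

scores = [69, 75, 77, 79, 82, 84, 87, 88, 89, 89, 89,
          90, 91, 93, 96, 100, 100]

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its leading digits (stem)."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        plot[stem].append(leaf)
    return dict(plot)

plot = stem_and_leaf(scores)
for stem, leaves in plot.items():
    print(f"{stem:>2} | {' '.join(str(leaf) for leaf in leaves)}")
```

For the score 100, the stem is 10 and the leaf is 0, following the "all but the last digit" rule.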

**Dot Plot:**
**Pie Chart:**


**Pareto chart:** AKA Pareto diagram or Pareto analysis. A Pareto chart is a bar graph in which the height of each bar represents frequency or cost, and the bars are arranged in order of decreasing height from left to right. A Pareto chart is useful, for example, when analyzing data about the frequency of problems or causes in a process, or when analyzing broad causes by looking at their specific components.

=**Section 2.3**=


**paired data sets:** when each entry in one data set corresponds to one entry in a second data set.


**scatter plot:** a graph in which the ordered pairs of a paired data set are plotted as points on a coordinate plane.


**Example 6:** In the scatter plot, as the first variable increases, petal width also increases.


**time series:** a data set composed of quantitative entries taken at regular intervals over a period of time.


**time series chart:** a chart used to graph a time series.


**Example 7:** The amount of precipitation measured each day for one month.

**SECTION 2.3**
Measure of Central Tendency: a value that represents a typical, or central, entry of a data set (mean, median, and mode).

Mean: the sum of the data entries divided by the number of entries.

Population mean: μ = Σx / N. Sample mean: x̄ = Σx / n.

To find the mean:
 * Step 1: add all the data entries together.
 * Step 2: divide the sum by the number of entries.

Median: the numerical value separating the higher half of a data set from the lower half; it is the middle number in an ordered set of data.

With an odd number of entries, the median is the middle entry once the data are put in order (17, 16, 18, 19, 20 ordered is 16, 17, 18, 19, 20; the median is 18). With an even number of entries, the median is the mean of the two middle entries. For 25, 60, 80, 97, 100, 130, 140, 200, 220, 250 there are 10 entries, so the median is the average of the 5th and 6th entries, 100 and 130, which is 115.
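The two cases (odd and even number of entries) can be sketched as a small function. This is only an illustration using the two data sets from the notes; Python's standard library also provides `statistics.median`, which gives the same results.

```python
def median(values):
    """Middle entry of the ordered data, or the mean of the two middle entries."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                      # odd count: single middle entry
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: average the two middle entries

print(median([17, 16, 18, 19, 20]))                              # 18
print(median([25, 60, 80, 97, 100, 130, 140, 200, 220, 250]))   # 115.0
```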


**Mode:** The data entry that has the greatest frequency.
**Bimodal:** If two entries have the same greatest frequency, each entry is a mode and the data set is bimodal.
**Example 4:** Flight prices are 388, 397, 397, 427, 432, 782, 872. The mode is 397 because it occurs the most.
**Example 5:** The mode is Republican because that response had the highest frequency.
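A short sketch of finding the mode by counting, using the flight prices from Example 4. The second data set is hypothetical, included only to show a bimodal result, and returning an empty list when no entry repeats is a choice made for this sketch.

```python
from collections import Counter

def modes(values):
    """Return every entry tied for the greatest frequency, or [] if no entry repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top == 1:
        return []  # no entry repeats, so there is no mode
    return [value for value, count in counts.items() if count == top]

flight_prices = [388, 397, 397, 427, 432, 782, 872]
print(modes(flight_prices))      # [397]
print(modes([1, 1, 2, 2, 3]))    # hypothetical bimodal data: [1, 2]
```

The same counting works for non-numeric data such as the party responses in Example 5, since only frequencies are compared.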

**Example 6: Comparing the mean, median, and mode.** Find the mean, median, and mode of the ages in the data set.


**Mean**: x̄ = Σx / n = 475/20 ≈ 23.8 years


**Median**: (21 + 22)/2 = 21.5 years


**Mode**: the entry occurring with the greatest frequency is 20 years.

An **outlier** is a data entry that is far removed from the other entries in the data set.

[Image: age6_01-eng.jpeg]


**Weighted Mean:** the mean of a data set whose entries have varying weights. A weighted mean is given by x̄ = Σ(x·w) / Σw, where w is the weight of each entry x.


**Example 7:** Finding a weighted mean. Your grade in a class is determined by five sources with scores 86, 96, 82, 98, and 100, and each score is weighted differently: the 86 has a weight of 50%, the 96 has 15%, the 82 has 20%, the 98 has 10%, and the 100 has 5%. Multiply each score by its weight to get x·w, and add these products to get Σ(x·w) = 88.6. Σw is the total of the weights, which is 1 (100%). So the weighted mean is 88.6/1 = 88.6.
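The Example 7 calculation can be checked with a short sketch of Σ(x·w) / Σw. It assumes the score 82 carries the 20% weight, which is the assignment that reproduces the 88.6 in the notes; the function name is made up for this illustration.

```python
def weighted_mean(values, weights):
    """Sum of value * weight, divided by the sum of the weights."""
    total = sum(x * w for x, w in zip(values, weights))
    return total / sum(weights)

scores = [86, 96, 82, 98, 100]
weights = [0.50, 0.15, 0.20, 0.10, 0.05]  # assumption: 82 is weighted 20%

result = weighted_mean(scores, weights)
print(round(result, 1))  # 88.6
```

Dividing by the sum of the weights means the same function also works when the weights do not add up to 1, for example credit hours.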


**Mean of a frequency distribution:** for a sample, it is approximated by x̄ = Σ(x·f) / n, where x is the class midpoint, f is the class frequency, and n = Σf.


**Example 8:** Find the mean number of minutes that a sample of Internet subscribers spent online during their most recent session. The class midpoints x (each found by adding the lower and upper class limits and dividing by 2) are 12.5, 24.5, 36.5, 48.5, 60.5, 72.5, 84.5, and the frequencies f are 6, 10, 13, 8, 5, 6, 2. In the formula, Σ(x·f) is the sum of each midpoint times its frequency, which is 2089, and n is the sum of the frequencies, which is 50. So the mean of the frequency distribution is 2089/50 ≈ 41.8 minutes.
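A quick sketch that reproduces the Example 8 arithmetic from the midpoints and frequencies given in the notes. The variable names are only for illustration.

```python
midpoints = [12.5, 24.5, 36.5, 48.5, 60.5, 72.5, 84.5]
freq = [6, 10, 13, 8, 5, 6, 2]

n = sum(freq)                                          # n = sum of f
sum_xf = sum(x * f for x, f in zip(midpoints, freq))   # sum of x * f
mean = sum_xf / n

print(n)               # 50
print(sum_xf)          # 2089.0
print(round(mean, 1))  # 41.8
```

The result is an approximation of the sample mean, because each entry is treated as if it sat at its class midpoint.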

Noah Pg. 71

Symmetric: when a vertical line drawn down the middle of the graph creates two halves that are mirror images.

Uniform (or rectangular): all entries or classes in the frequency distribution have equal or approximately equal frequencies. Note: a uniform distribution is ALSO symmetric.

Skewed: the tail of the graph extends further in one direction than the other. A graph that is skewed left has a tail extending to the left and is negatively skewed. A graph that is skewed right has a tail extending to the right and is positively skewed.

=**Section 2.4:**=
**Page 80 & 81**

Range: the difference between the maximum and minimum data entries.

Deviation: the difference between an entry and the mean of the data set, x − μ.

Population variance: the mean of the squared deviations, σ² = Σ(x − μ)² / N.


**Page 82: Population standard deviation:** for a population data set of N entries, the square root of the population variance, σ = √(Σ(x − μ)² / N).





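**Population variance and standard deviation**
Sample variance and sample standard deviation: for a sample of n entries, the sample variance is s² = Σ(x − x̄)² / (n − 1) and the sample standard deviation is s = √(s²). How do you do it? Listen to her!
media type="youtube" key="-asDKTilfjs" height="315" width="560"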
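To compare the population and sample versions side by side, here is a minimal sketch. The data set is hypothetical, chosen so the numbers come out cleanly; the only difference between the two calculations is dividing by N versus n − 1.

```python
import math

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data entries

n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean
sum_sq_dev = sum((x - mean) ** 2 for x in data)

pop_variance = sum_sq_dev / n            # treat the data as a whole population
pop_std = math.sqrt(pop_variance)

sample_variance = sum_sq_dev / (n - 1)   # treat the data as a sample
sample_std = math.sqrt(sample_variance)

print(mean, sum_sq_dev)                  # 5.0 32.0
print(pop_variance, pop_std)             # 4.0 2.0
print(round(sample_variance, 3), round(sample_std, 3))
```

Python's built-in `statistics` module offers `pvariance`, `pstdev`, `variance`, and `stdev` for the same four quantities.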
**Page 83:**


**Empirical Rule:**

Pg. 84, Shon: Instructions on how to calculate the sample mean and sample standard deviation on a TI-83/84:
 * 1) Press **STAT**.
 * 2) Choose the EDIT menu, 1:Edit.
 * 3) Enter the sample office rental rates into L1.
 * 4) Press **STAT**.
 * 5) Choose the CALC menu, 1:1-Var Stats.
 * 6) Press **ENTER**.
 * 7) Press **2ND** L1 **ENTER**.

media type="youtube" key="uMgK000XFhA" height="315" width="420"
**Chebychev's Theorem:**

=**Section 2.5:**=


**Page 100:**
**Quartiles: Three special percentiles which divide the ordered data into four groups of approximately equal size.**
**- First Quartile: is the 25th percentile or 0.25 fractile.**
**- Second Quartile: is the 50th percentile or 0.50 fractile.**
**- Third Quartile: is the 75th percentile or 0.75 fractile.**
**THERE IS NO FOURTH QUARTILE.**
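One common way to find the three quartiles by hand is sketched below: the second quartile is the median, and the first and third quartiles are the medians of the lower and upper halves of the ordered data (leaving the median itself out when the number of entries is odd). This convention is an assumption here; other software may interpolate and give slightly different values. The data are the 17 test scores listed in Section 2.2 of these notes.

```python
def median(ordered):
    """Median of a list that is already in order."""
    n = len(ordered)
    mid = n // 2
    return ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[: n // 2]        # entries below the median position
    upper_half = ordered[(n + 1) // 2 :]  # entries above the median position
    return median(lower_half), median(ordered), median(upper_half)

scores = [69, 75, 77, 79, 82, 84, 87, 88, 89, 89, 89,
          90, 91, 93, 96, 100, 100]

q1, q2, q3 = quartiles(scores)
print(q1, q2, q3)  # 80.5 89 92.0
```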


**How to find the upper and lower quartile:**

media type="youtube" key="psALBCNC62I" height="315" width="420"

**Fractile:**

Noun
**fractile** (//plural// **fractiles**)
 * 1) (statistics) The value of a distribution for which some fraction of the sample lies below. The //q//-quantile is the same as the //(1/q)//-**fractile**. The median is the .5-**fractile**.

media type="youtube" key="DGAXeX42eoE" height="315" width="560"

**Using technology to find quartiles:**
Steps:
 * 1) Go to STAT; it is already on EDIT, so hit ENTER.
 * 2) Type your data into the L1 column.
 * 3) Hit 2ND, then QUIT.
 * 4) Go to STAT again, but this time arrow over to CALC (one to the right); it is already on 1-Var Stats, so hit ENTER.
 * 5) Press the down arrow twice and then hit ENTER, or hit ENTER three times.
 * 6) Press the down arrow 5 times to reach the information you need (it matches what is shown on page 101 under TI-83/84 PLUS).

Page 104

**- A z-score can be negative, positive, or zero.**
PG. 106: INTERPRETING Z-SCORES

"Z-Scores tell us whether a particular score is equal to the mean, below the mean or above the mean of a bunch of scores. They can also tell us how far a particular score is away from the mean."

The grades on a Spanish midterm at Gardner Bullis are normally distributed with μ = 83 and σ = 3.5. Gabriela earned an 82 on the exam. Find the z-score for Gabriela's exam grade. Round to two decimal places.

A z-score is defined as the number of standard deviations a specific point is away from the mean. We can calculate the z-score for Gabriela's exam grade by subtracting the mean (μ) from her grade and then dividing by the standard deviation (σ).

z = (x − μ) / σ = (82 − 83) / 3.5 ≈ −0.29

The z-score is −0.29. In other words, Gabriela's score was 0.29 standard deviations below the mean.
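The same z-score calculation as a small sketch, using the values from the Gabriela example (x = 82, μ = 83, σ = 3.5). The function name `z_score` is made up for this illustration.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / std_dev

z = round(z_score(82, 83, 3.5), 2)
print(z)  # -0.29
```

A negative result means the score is below the mean, a positive result means it is above, and zero means it equals the mean.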