Pennies for Your Thoughts

In many of the problems from the last unit, you were given information about the population. For many of the variables, it was assumed the variable had a normal distribution. What if the variables you are studying are not normally distributed? Here is your challenge: if the population is not normal, can you still make inferences about that population from your random samples?

Main Post:

You will create histograms of sample means, using samples of equal size. Click on this link to access the workbook for the discussion. The samples will be of size 5 (n = 5) and size 25 (n = 25). Each histogram will have a bin width of 2.
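As a rough illustration of what the two histograms should show, the sketch below draws repeated samples of size 5 and size 25 from a skewed population and bins the sample means with a bin width of 2. The population here is synthetic (an exponential distribution with a mean of about 12, chosen only because it is clearly non-normal); it is an assumption for illustration and is not the workbook data. The helper names `sample_means` and `histogram_counts` are hypothetical.

```python
import random
import statistics
from collections import Counter

def sample_means(population, n, draws, rng):
    """Draw `draws` simple random samples of size n and return each sample's mean."""
    return [statistics.mean(rng.sample(population, n)) for _ in range(draws)]

def histogram_counts(values, bin_width=2):
    """Count values per bin; each key is the left edge of a bin of the given width."""
    return Counter(int(v // bin_width) * bin_width for v in values)

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic right-skewed population (assumption for illustration, not the workbook data).
population = [rng.expovariate(1 / 12) for _ in range(10_000)]

means_5 = sample_means(population, 5, 1000, rng)
means_25 = sample_means(population, 25, 1000, rng)

for label, means in (("n = 5", means_5), ("n = 25", means_25)):
    print(label,
          "| mean of sample means:", round(statistics.mean(means), 2),
          "| sd of sample means:", round(statistics.stdev(means), 2))
    # Text histogram: one '#' per 10 sample means in the bin.
    for left_edge, count in sorted(histogram_counts(means).items()):
        print(f"  [{left_edge:>3}, {left_edge + 2:>3}) {'#' * (count // 10)}")
```

Under these assumptions, the n = 25 histogram should be narrower and more symmetric than the n = 5 histogram, even though the population itself is skewed.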

Generate a Representative Sample of the Data
Select a region and generate a simple random sample of 30 from the data.
Report the mean, median, and standard deviation of the listing price and the square foot variables.
Analyze Your Sample
Discuss how the regional sample created is or is not reflective of the national market.
Compare and contrast your sample with the population using the National Summary Statistics and Graphs Real Estate Data PDF document.
Explain the methods you used to make sure that the sample is truly random.
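One way to draw and document a simple random sample of 30 is sketched below. The regional listings are synthetic stand-ins generated in the code (an assumption; the real data would come from the assignment's data set), and `summarize` is a hypothetical helper. Recording the seed is one way to show how randomness was achieved and to make the sample reproducible.

```python
import random
import statistics

rng = random.Random(2024)  # recording the seed documents how the sample was drawn

# Synthetic regional listings as (listing_price_usd, square_feet) pairs.
# These values are made up for illustration and are not the assignment data.
region = [(round(rng.gauss(300_000, 60_000), -2), round(rng.gauss(1_900, 400)))
          for _ in range(200)]

# Simple random sample: every listing has the same chance of selection,
# and no listing is selected twice.
sample = rng.sample(region, 30)

def summarize(values):
    """Return the mean, median, and sample standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),
    }

prices = [price for price, _ in sample]
square_feet = [sqft for _, sqft in sample]

print("listing price:", summarize(prices))
print("square feet:  ", summarize(square_feet))
```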
Generate Scatterplot
Create a scatterplot of the x and y variables noted above, and include a trend line and the regression equation.

Observe patterns
Answer the following questions based on the scatterplot:
Define x and y. Which variable is useful for making predictions?
Is there an association between x and y? Describe the association you see in the scatter plot.
What shape does the relationship appear to have?

If you had a 1,800 square foot house, based on the regression equation in the graph, what price would you choose to list at?

Do you see any potential outliers in the scatterplot?

Why do you think the outliers appeared in the scatterplot you generated?

What do they represent?
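The trend line, the 1,800 square foot prediction, and the outlier questions can be sketched together as follows. This is a minimal example on synthetic listings (an assumption, not the assignment data), with x as square feet and y as listing price. The helpers `fit_line` and `flag_outliers` are hypothetical, and the rule of flagging residuals larger than 2 residual standard errors is one common convention rather than the method the assignment requires.

```python
import statistics

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def flag_outliers(xs, ys, threshold=2.0):
    """Return points whose residual exceeds `threshold` residual standard errors."""
    slope, intercept = fit_line(xs, ys)
    residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]
    # The residual standard error uses n - 2 degrees of freedom for a fitted line.
    resid_se = (sum(r * r for r in residuals) / (len(xs) - 2)) ** 0.5
    return [(x, y) for x, y, r in zip(xs, ys, residuals) if abs(r) > threshold * resid_se]

# Synthetic listings (made up for illustration): one listing is priced far above the rest.
square_feet = [1100, 1300, 1500, 1700, 1900, 2100, 2300, 2500, 2700, 2900]
listing_price = [195_000, 225_000, 255_000, 285_000, 600_000,
                 345_000, 375_000, 405_000, 435_000, 465_000]

slope, intercept = fit_line(square_feet, listing_price)
print(f"regression equation: price = {slope:.2f} * sqft + {intercept:.2f}")
print(f"predicted list price at 1,800 sq ft: {slope * 1800 + intercept:,.0f}")

outliers = flag_outliers(square_feet, listing_price)
print("possible outliers:", outliers)

# Refit without the flagged points to see how much they pull the line.
kept = [(x, y) for x, y in zip(square_feet, listing_price) if (x, y) not in outliers]
clean_slope, clean_intercept = fit_line([x for x, _ in kept], [y for _, y in kept])
print(f"prediction at 1,800 sq ft without outliers: {clean_slope * 1800 + clean_intercept:,.0f}")
```

Comparing the two predictions gives a concrete way to discuss what an outlier represents, for example a luxury property or a data entry error, and how much it shifts the fitted line.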

You can use the following tutorial that is specifically about this assignment. Make sure to check the assignment prompt for specific numbers used for national statistics and/or square footage. The video may use different national statistics or solve for different square footage values.

Submit a Word-knitted version of the completed R Markdown file found in this zip file.
First, as always, let’s visualise the data. In the code chunk for this question make the appropriate figure visualizing the attendance for each of the venues. Your figure should show each of the individual data points as well as a different geom displaying a summary for each venue, a colour scheme with different colours for each venue and the colours of the data points matching the geom, no legend, and of course a title, informative axis labels, and a nice theme.
There are several assumptions you could test, but for the sake of simplicity we will focus on the normality assumption. Evaluate normality using the appropriate test. In 30 words or less, report the statistical test and indicate what the results of the code reveal. Was the assumption violated? How can you tell?
Run the appropriate statistical test to evaluate the research hypothesis given Q2. Report on the results in 60 words or less. In your report, don’t worry about including descriptive statistics but do include an explanation of which statistical test you used, what the predictor and outcome variables were, the appropriate stats reference, the effect size and its interpretation, and the interpretation of this data in terms of the research question.
Perform post-hoc pairwise tests and in 40 words or less, report the tests and which venues had significantly lower attendance than others, along with their p-values. You do not need to report the non-significant venues or their p-values.
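The questions above do not name the test, but comparing attendance across several venues and following up with post-hoc pairwise tests suggests a one-way ANOVA. Under that assumption, the sketch below shows how the F statistic, its degrees of freedom, and eta squared (one common effect size) are computed. The attendance figures and venue names are made up, and `one_way_anova` is a hypothetical helper. The p-value, the normality test, and the post-hoc comparisons are left to the statistics software used for the assignment.

```python
import statistics

def one_way_anova(groups):
    """Return (F, df_between, df_within, eta_squared) for a list of groups."""
    all_values = [v for group in groups for v in group]
    grand_mean = statistics.mean(all_values)
    group_means = [statistics.mean(group) for group in groups]

    # Between-group variation: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Within-group variation: how far each value sits from its own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_between, df_within, eta_squared

# Made-up attendance counts for three hypothetical venues.
attendance = {
    "Venue A": [410, 395, 430, 420, 405],
    "Venue B": [380, 370, 390, 365, 385],
    "Venue C": [300, 320, 310, 295, 315],
}

f_stat, df_b, df_w, eta_sq = one_way_anova(list(attendance.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}, eta squared = {eta_sq:.2f}")
```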

Interpret and validate statistics presented to the general population via the media. Find an example of statistics in the media.

This could be a newspaper article, magazine article, a segment from a news broadcast or a digital news article. Below are some guidelines/questions you may use to structure your paper. Total length should be 1-2 pages, single spaced, of content plus a bibliography in MLA or APA format.

Research design and rationale for chosen design:
Depending on the type of study to be conducted, a different tool/checklist should be used. To assist with research protocol preparation, the following checklists can be used as a template:
Tool / checklist name | Abbreviated name | Intended study type
Strengthening the Reporting of Observational studies in Epidemiology | STROBE checklist | Observational studies
Consolidated Standards of Reporting Trials | CONSORT Statements | Randomised controlled trials
Standards for Reporting Qualitative Research | SRQR recommendations | Qualitative research
Standards for Quality Improvement Reporting Excellence | SQUIRE guidelines | Quality improvement studies

This should include details of how participants will be sampled, including inclusion and exclusion criteria. Explore how your sample will be accessed.

Explore data collection methods. Include a timeframe for data collection.
Data collection tools presented in an appendix will not be included in assessment word count.
Explore potential ethical considerations.

Benchmark – Correlation and Regression Project

Use the following information to complete the questions below.
Use the following data points that have a linear relationship:
Substance Abuse and Suicide: Percent of the Total U.S. Population
Year | Substance Use (X variable) | Suicides (Y variable)
1999 6.82 0.000105
2000 6.78 0.000104
2001 7.00 0.000107
2002 7.12 0.00011
2003 7.09 0.000108
2004 7.05 0.00011
2005 7.17 0.000109
2006 7.24 0.00011
2007 7.28 0.000113
2008 7.36 0.000116
2009 7.80 0.000118
2010 7.81 0.000121
2011 7.88 0.000123
2012 7.87 0.000125
2013 8.07 0.000126
2014 8.12 0.00013
2015 8.06 0.000133
2016 8.17 0.000134
2017 8.18 0.00014
2018 8.23 0.000142

In 500-750 words, address the following:

Identify the correlation coefficient for each of the possible pairings of variables. Describe the relationship in terms of strength and direction.

Find a linear model of the relationship between the three variables of interest. Identify the predictor variables and the criterion variable.
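For reference, the pairwise correlation coefficients can be checked outside SPSS with a short script. The sketch below applies the standard Pearson formula to the table above; `pearson_r` is a hypothetical helper name, and the values are copied from the table. It reports only the strength and direction of each linear association and says nothing about cause and effect.

```python
from itertools import combinations

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

# Values copied from the table (percent of the total U.S. population).
variables = {
    "Year": list(range(1999, 2019)),
    "Substance Use": [6.82, 6.78, 7.00, 7.12, 7.09, 7.05, 7.17, 7.24, 7.28, 7.36,
                      7.80, 7.81, 7.88, 7.87, 8.07, 8.12, 8.06, 8.17, 8.18, 8.23],
    "Suicides": [0.000105, 0.000104, 0.000107, 0.00011, 0.000108, 0.00011, 0.000109,
                 0.00011, 0.000113, 0.000116, 0.000118, 0.000121, 0.000123, 0.000125,
                 0.000126, 0.00013, 0.000133, 0.000134, 0.00014, 0.000142],
}

# One coefficient for each possible pairing of the three variables.
correlations = {
    (a, b): pearson_r(variables[a], variables[b])
    for a, b in combinations(variables, 2)
}
for (a, b), r in correlations.items():
    print(f"{a} vs {b}: r = {r:.3f}")
```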

Provide an output of the SPSS results and interpret the results using correct APA style. Be sure to include the following in your interpretation:
Cause and effect concepts
Independent/dependent variable relationships
Why is it important to do random sampling?
What is the regression fallacy? How may it apply to the relationships discovered?

Chi-Square SPSS activity

1. Are education degree and political views related?

a. Examine the relationship between respondent’s health and social class. Treat social class as the independent variable and health as the dependent variable. Interpret the results, reporting them in the same way as the above example. Be sure to report the chi-square value, degrees of freedom, and p-value, and then your decision to reject the null hypothesis or not. Describe these findings in your own words.

b. Add SEX as a control variable and calculate. Is the relationship stronger for women or men? Why might this be?
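To see where the chi-square value and degrees of freedom in the SPSS output come from, the sketch below computes them by hand for a crosstab. The counts are made up for illustration (rows are social class, columns are health) and are not survey data; `chi_square_independence` is a hypothetical helper. The p-value itself is left to SPSS; the comparison against 9.488 uses the standard table value for 4 degrees of freedom at the 0.05 level.

```python
def chi_square_independence(table):
    """Return (chi_square, degrees_of_freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, dof

# Made-up counts: rows = social class (lower, middle, upper),
# columns = health (good, fair, poor).
observed = [
    [30, 40, 30],
    [60, 30, 10],
    [45, 10, 5],
]

chi_square, dof = chi_square_independence(observed)
print(f"chi-square = {chi_square:.2f}, df = {dof}")

# Critical value for df = 4 at alpha = 0.05 from a standard chi-square table.
CRITICAL_VALUE_DF4 = 9.488
print("reject the null hypothesis" if chi_square > CRITICAL_VALUE_DF4
      else "fail to reject the null hypothesis")
```

Adding a control variable such as SEX amounts to repeating the same calculation on one table per category and comparing the results.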

Any topic related to Psychology

Complete this assignment:
1) Create a scenario that would require the use of the Pearson correlation coefficient. This means both variables must be interval or ratio scale data. Examples include looking at:
• the relationship between the number of calories burned in a workout and the amount of water weight lost during the workout
• the relationship between a person’s annual income and their happiness score on your choice of a validated happiness metric
• the relationship between the number of hours of sleep people get per night and your choice of one test of cognitive function or ratings of general physical wellness
Google for “validated test of…” to find examples of real scales or measurement tools.
2) Write your research question and matching hypothesis.
3) Generate fake/mock data and enter it correctly into SPSS (see the Week 9 SPSS Discussion Instructions).
4) Calculate the useful descriptive statistics, histograms for each variable if you like, a scatterplot of your data, and the Pearson correlation coefficient. Verify that your data meet the assumptions of using the Pearson correlation coefficient by looking at your descriptive statistics.
5) Present your results in an infographic of your choice.
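Steps 3 and 4 can be prototyped before entering anything into SPSS. The sketch below generates mock data for one of the example scenarios (hours of sleep and a cognitive score), prints descriptive statistics with a skewness value as a rough symmetry check, and computes the Pearson correlation coefficient. The score scale, the built-in relationship, the sample size of 40, and the helper names are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so the mock data can be regenerated

# Mock data only: sleep in hours, and a made-up cognitive score that
# rises with sleep plus random noise.
sleep_hours = [round(rng.gauss(7, 1), 1) for _ in range(40)]
cognitive_score = [round(50 + 5 * h + rng.gauss(0, 4), 1) for h in sleep_hours]

def skewness(values):
    """Population skewness; values near 0 suggest a roughly symmetric distribution."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(((v - mean) / sd) ** 3 for v in values) / len(values)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

for name, values in (("sleep hours", sleep_hours), ("cognitive score", cognitive_score)):
    print(f"{name}: mean = {statistics.mean(values):.2f}, "
          f"median = {statistics.median(values):.2f}, "
          f"sd = {statistics.stdev(values):.2f}, "
          f"skewness = {skewness(values):.2f}")

r = pearson_r(sleep_hours, cognitive_score)
print(f"Pearson r = {r:.2f}")
```

A mean close to the median and a skewness near zero are informal signs that each variable is roughly symmetric; the scatterplot is still needed to check that the relationship is linear.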