Category: Statistics
-
Statistical analysis and data interpretation
In 1-3 sentences, define each of the following terms in your own words and provide an example:
- Correlation coefficient (2 marks)
- Linear regression (2 marks)
- Small sample bias (2 marks)
- Common cause relationship (2 marks)
You are hired to study the statistical relationship between the number of trees and the number of Starbucks in different neighbourhoods of a city. Describe how you would obtain this information and two biases that could occur in your survey. How would you avoid these biases? (5 marks)
The next 3 questions refer to the following data set. Show any relevant calculations or spreadsheet formulas that you used while responding to the questions.
Years: 0.9, 2.0, 2.1, 4.0, 4.8, 5.2, 6.2, 9.0, 9.1
Height (cm): 0.79, 3.98, 4.81, 11.41, 29.72, 24.2, 26.81, 91.47, 129.45
- Use automatic linear regression (e.g. using Excel or Sheets) to find a line of best fit, with the Years as the x values and the Height (cm) as y values. Explain exactly what you did. (4 marks)
- Find the correlation coefficient and coefficient of determination for the data. Assess how well the line fits the data. (5 marks)
- Create a residual plot for the data using your line of best fit. What conclusions can you draw from the residual plot? (4 marks)
For the next 5 questions, use the internet to find a reliable time series data set with at least 10 years of data, and at least 10 data points.
- State the source of the data, and explain how you know it is a reliable source. (4 marks)
- Perform a linear regression and interpret the results. (5 marks)
- Using the regression results from the previous question, estimate the value that would occur 10 years before your first data point, and 10 years after your last data point. How valid are these predictions? (6 marks)
- Create a residual plot and comment on the appropriateness of the model. (5 marks)
- Even though the source of the data is reliable, describe at least two biases or mistakes that could have been made in collecting the data that would affect any conclusions. Include questions you may ask to investigate these potential issues. (6 marks)
The next 2 questions refer to the following data set. Show any relevant calculations or spreadsheet formulas that you used while responding to the questions.
x: 1, 2, 3, 4, 5, 6, 7, 8, 9
y: 7.2, 6.2, 2.4, 17.0, 35.1, 60.0, 96.6, 133.5, 180.1
- Create a scatter plot of the data, and create a residual plot for a line of best fit. Based on your observation, does the data follow a linear pattern? (5 marks)
- While we used a line of best fit to calculate residuals earlier, you can actually use any model to calculate residuals and create a residual plot. Use the estimate to calculate a new set of residuals, using a formula or a calculator. Explain your steps, then create a residual plot using this new model and assess its validity. (6 marks)
For each of the following relationships, predict whether they are cause-effect, accidental, or common-cause. Explain your conclusion and, if they are cause-effect, explain which causes the other and why.
- Students with a shorter travel time to school have lower test scores. (3 marks)
- Children with more books in their home are more likely to earn a PhD when they grow up. (3 marks)
- People born between the 15th and 25th of any month are more likely to have a cell phone number ending in 26. (3 marks)
- People who drink more coffee in the morning are more likely to have insomnia (the inability to sleep) at night. (3 marks)
-
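For the Years/Height (cm) data set in the statistical analysis assignment above, the spreadsheet regression can be cross-checked by hand. The following is a minimal sketch, assuming ordinary least squares and the Pearson correlation coefficient; the function name `fit_line` and the variable names are illustrative and do not come from the assignment.

```python
import math

# Data copied from the assignment's Years / Height (cm) table.
years = [0.9, 2.0, 2.1, 4.0, 4.8, 5.2, 6.2, 9.0, 9.1]
height_cm = [0.79, 3.98, 4.81, 11.41, 29.72, 24.2, 26.81, 91.47, 129.45]


def fit_line(x, y):
    """Ordinary least-squares line y = intercept + slope * x, plus Pearson r.

    Hypothetical helper written for this sketch.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


slope, intercept, r = fit_line(years, height_cm)
r_squared = r ** 2  # coefficient of determination for a simple linear fit

# Residual = observed value minus the value predicted by the line.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(years, height_cm)]

print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}, r^2={r_squared:.3f}")
for xi, res in zip(years, residuals):
    print(f"x={xi:>4}: residual={res:8.2f}")
```

Plotting `residuals` against `years` gives the residual plot. A curved pattern in that plot (residuals positive at both ends and negative in the middle, for example) would suggest that a straight line is not the most appropriate model.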
Describing Data
Find two peer-reviewed sociological articles on a sociological topic of interest that use descriptive statistics covered in this course (frequencies, crosstabulations, measures of central tendency and variability, probability or the normal distribution).
For each article:
- Describe the research context (topic, question, purpose)
- Describe the data (For example, survey, census, etc.) and variables used
- Identify the descriptive statistics presented (including visuals)
- Evaluate how effectively they summarize the data.
Focus only on describing data, not hypothesis testing or causal inference.
Attached Files (PDF/DOCX): SOC – 3316 (7).pdf, SOC – 3316 (6).pdf
Note: Content extraction from these files is restricted, please review them manually.
-
STT 503: Quantitative Business Analysis
I would like to complete the assignment without plagiarism or AI-generated content. Please review the files and simplify the language, as English is my second language.
Requirements:
-
Gym membership gender analysis
Instructions
REMEMBER: You will be submitting 2 files: an Excel file with your work AND a Word file with your report. You will receive a zero (0) for the assignment if you don’t submit both parts! Your report should contain the Excel tables, graphics, and statistics you are asked to create, copied from Excel into Word with copy/paste. You can also use the “snipping tool” as an alternative for copying from Excel to Word.
Use the Excel file M06 Business Application – AJ Fitness.
Part 1
The owner of the AJ Fitness center is interested in whether men or women visit the gym more regularly and by how much. Sample data are in the Excel file above. Use the steps below to find the answer.
- Use a pivot table to find the distribution of typical visits per week, grouping by 1 visit.
- Use the pivot table to create a histogram. Does the data appear to be normally distributed?
- Report the average value of typical visits per week by gender. Answer the question: Which gender visits the gym more per week? By how much?
- Report the standard deviation and sample size of typical visits per week by gender.
- Calculate a 95% confidence interval for the difference in typical visits per week for men and women. Assume equal variances.
- Interpret your confidence interval.
- Using your confidence interval results, how would you answer the question, “Is there a significant difference in the number of typical visits per week for male and female members with the club?”
- Show work either using Excel commands or using the Excel CALCULATOR.
Part 2
The owner also wishes to see if there is a significant difference in the average age of men and women who attend the gym. Here are the steps to test whether there is a significant difference.
- Report the average value of age by gender. Answer the question: Which gender is older on average? By how much?
- Report the standard deviation and sample size of age by gender.
- Test whether there is a significant difference in the average age of men and women who attend the gym. Assume the data are normally distributed with equal variances. Use an alpha = 0.10 level of significance.
- State the hypotheses.
- Give the test statistic and p-value.
- State your conclusion.
- Show work either using Excel commands or using the Excel CALCULATOR.
Part 3
The owner of AJ Fitness also wants to answer the question: Do men or women tend to stay with the club for more years? Follow these steps to answer that question.
- Use a pivot table to count the number of members with 0-5 and 5-10 years with the club by gender. You will need to group the years with the club in your pivot table.
- Using your counts, find the proportion of men who have been with the club 5 or more years and the proportion of women who have been with the club 5 or more years.
- Test whether there is any difference between the proportion of men with 5 or more years and the proportion of women with 5 or more years, using an alpha = 0.05 level of significance.
- State the hypotheses.
- Give the test statistic and p-value.
- State your conclusion.
- Show work either using Excel commands or using the Excel CALCULATOR.
Submit a Word document of your written report. Use complete sentences and thoughts. Consider writing a paragraph for each part of this assignment. Report in your Word document all your findings using the calculations, tables, and graphics from Excel. You can copy and paste from Excel into Word. Create a report worthy of submission to management.
Submit your Excel file demonstrating your work using the appropriate Excel commands or Excel Calculators for full credit.
Attached Files (PDF/DOCX): MO6 Printable Business Application and Rubric.docx
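The Part 3 comparison of proportions is commonly carried out as a pooled two-proportion z-test. The sketch below shows one way to compute the test statistic and two-sided p-value. The function name `two_proportion_z_test` is illustrative, and the counts passed to it are hypothetical placeholders, not values from the AJ Fitness file; the real counts come from the pivot table.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-proportion z-test for H0: p1 = p2 versus H1: p1 != p2.

    x1, x2 are counts of "successes" (here: members with 5 or more years);
    n1, n2 are the group sizes. Hypothetical helper written for this sketch.
    """
    p1 = x1 / n1
    p2 = x2 / n2
    # Under H0 both groups share one proportion, so pool the counts.
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p1, p2, z, p_value


# HYPOTHETICAL counts for illustration only (not from the assignment data):
# 30 of 100 men and 45 of 100 women with 5 or more years at the club.
p_men, p_women, z, p_value = two_proportion_z_test(30, 100, 45, 100)
print(f"p_men={p_men:.3f}, p_women={p_women:.3f}, z={z:.3f}, p-value={p_value:.4f}")
```

With alpha = 0.05, the null hypothesis of equal proportions is rejected when the p-value falls below 0.05.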
-
Null hypothesis testing and statistical errors
600-word essay (2-3 pages) with 7-10 scholarly peer-reviewed articles, APA 7, double spaced. The book is Applied Statistics I: Basic Bivariate Techniques by Rebecca Warner, 2021.
ESSAY: THE NULL HYPOTHESIS AND YOU
ASSIGNMENT PROMPTS
- Provide an example of a situation where a researcher would utilize a directional significance test.
- What factors contribute to the value of and when a researcher reports a p value, what are they reporting? Do we typically want a p value to be small or large?
- Provide two scenarios: one where a Type I error occurs and another where a Type II error occurs. What factors influence the magnitude of risk for each of these, and what practices might a researcher engage in to minimize these risks?
- If the null hypothesis is incorrectly reported as significant, which type of error is occurring and what might the subsequent implications be? What conclusions, if any, can be drawn from such results?
Attached Files (PDF/DOCX): Essay Grading Rubric (1).pdf
-
Week 4 Jasp file
Load the attached file in JASP and perform the proper test to determine if there is a significant difference in the Degree of Reading Power for the control group and the treatment group to determine if the Directed Reading Activities treatment was successful. The variable descriptions and research overview are attached below.
First, use JASP to determine if the assumptions have been met.
- Use the Shapiro-Wilk test to determine if the data are normally distributed.
- Use Levene’s Test to test for homogeneity of variances.
- Based on the results of normality and homogeneity of variances, choose the correct statistical technique to test the hypothesis that there is a significant difference between the control and treatment groups.
Perform the correct test of means and include the location parameter with 95% confidence interval, the effect size, descriptives, descriptive plots, and raincloud plots.
Add your name to the module and use the results to answer the following questions. Round your answers to three decimal places as they are in JASP.
Q1a. What is the W-statistic for the Shapiro-Wilk test?
Q1b. What is the F-statistic for Levene’s test?
Q1c. What is the test statistic for the hypothesis test?
Q1d. What is the mean difference for the control and treatment groups?
Q1e. Based on the results, was the Directed Reading Activity successful (yes/no)?
Export your results to pdf and upload the file in Question 2.
Question 2
25 pts
Export your results from Question 1 to pdf and upload the file here.
Question 3
25 pts
Load the attached file in JASP and perform the proper test to determine if Recall is significantly greater for Fixation (Group 1) than for Horizontal Eye Movement (Group 2). The variable descriptions and research overview are attached below.
First, use JASP to determine if the assumptions have been met for the proper test of means.
- Use the Shapiro-Wilk test to determine if the data are normally distributed.
- Use Levene’s Test to test for homogeneity of variances.
- Based on the results of normality and homogeneity of variances, choose the correct statistical technique to test the hypothesis that there is a significant difference between the control and treatment groups.
Perform the correct test of means and include the location parameter with 95% confidence interval, the effect size, descriptives, descriptive plots, and raincloud plots.
Add your name to the module and use the results to answer the following questions. Round your answers to three decimal places as they are in JASP.
Q1a. What is the F-statistic for Levene’s test?
Q1b. What is the test statistic for the hypothesis test?
Q1c. What are the degrees of freedom for the test statistic?
Q1d. What is the mean difference for the control and treatment groups?
Q1e. What is the effect size?
Export your results to pdf and upload the file in Question 4.
Question 4
25 pts
Export your results from Question 3 to pdf and upload the file here.
Attached Files (PDF/DOCX): Directed Reading Activities – Week 4 Question 1.docx
-
Module 6 Discussion
The B&K Real Estate Company sells homes and is currently serving the Southeast region. It has recently expanded to cover the Northeast states. The B&K realtors are excited to now cover the entire East Coast and are working to prepare their southern agents to expand their reach to the Northeast.
B&K has hired your company to analyze the Northeast home listing prices in order to give information to their agents about the mean listing price at 95% confidence. Your company offers three analysis packages: one based on a sample size of 100 listings, one based on 1,000 listings, and another based on a sample size of 4,000 listings. Because there is an additional cost for data collection, your company charges more for the package with 4,000 listings than for the package with 100 listings.
Bronze Package – Sample size of 100 listings:
- 95% confidence interval for the mean of the Northeast house listing price has a margin of error of $24,500
- Cost for service to B&K: $2,000
Silver Package – Sample size of 1,000 listings:
- 95% confidence interval for the mean of the Northeast house listing price has a margin of error of $7,750
- Cost for service to B&K: $10,000
Gold Package – Sample size of 4,000 listings:
- 95% confidence interval for the mean of the Northeast house listing price has a margin of error of $3,900
- Cost for service to B&K: $25,000
The B&K management team does not understand the tradeoff between confidence level, sample size, and margin of error. B&K would like you to come back with your recommendation of the sample size that would provide the sales agents with the best understanding of northeast home prices at the lowest cost for service to B&K.
In other words, which option is preferable?
- Spending more on data collection and having a smaller margin of error
- Spending less on data collection and having a larger margin of error
- Choosing an option somewhere in the middle
For your initial post:
- Formulate a recommendation and write a confidence statement in the context of this scenario. For the purposes of writing your confidence statement, assume the sample mean house listing price is $310,000 for all packages. “I am [#] % confident the true mean . . . [in context].”
- Explain the factors that went into your recommendation, including a discussion of the margin of error
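The tradeoff between sample size and margin of error in the three packages can be checked numerically. The sketch below builds each 95% confidence interval from the stated sample mean of $310,000 and back-calculates the standard deviation implied by each margin. It assumes the usual normal-based formula, margin = z × sd / √n, which the scenario does not state explicitly; the dictionary layout and variable names are illustrative.

```python
import math
from statistics import NormalDist

SAMPLE_MEAN = 310_000  # sample mean the prompt says to assume for all packages

# Sample size, margin of error, and cost as listed in the scenario.
packages = {
    "Bronze": {"n": 100, "margin": 24_500, "cost": 2_000},
    "Silver": {"n": 1_000, "margin": 7_750, "cost": 10_000},
    "Gold": {"n": 4_000, "margin": 3_900, "cost": 25_000},
}

# Critical value for 95% confidence (about 1.96).
# ASSUMPTION: margins follow the normal-based formula z * sd / sqrt(n).
z = NormalDist().inv_cdf(0.975)

intervals = {}
implied_sd = {}
for name, pkg in packages.items():
    # Confidence interval = sample mean plus or minus the margin of error.
    intervals[name] = (SAMPLE_MEAN - pkg["margin"], SAMPLE_MEAN + pkg["margin"])
    # Rearranging margin = z * sd / sqrt(n) gives the implied standard deviation.
    implied_sd[name] = pkg["margin"] * math.sqrt(pkg["n"]) / z
    low, high = intervals[name]
    print(f"{name}: n={pkg['n']}, cost=${pkg['cost']:,}, "
          f"95% CI=(${low:,}, ${high:,}), implied sd=${implied_sd[name]:,.0f}")
```

Under that assumption, all three packages imply roughly the same standard deviation, so the narrower intervals come from the larger samples alone. Because the margin shrinks with the square root of n, going from 1,000 to 4,000 listings only about halves the margin ($7,750 to $3,900) while the cost rises from $10,000 to $25,000.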
-
Statistics 2
Requirements: As long as it needs to be to completely answer the question correctly
-
Assignment 5
Answer the questions using the attached Excel spreadsheet and supporting material. Provide step-by-step instructions and explanations in a separate document showing how you got each answer, and include all formulas if using Excel. An AI report and a Turnitin/iThenticate report must be provided.
Attached Files (PDF/DOCX): Week5-Probability-1.pdf
-
Statistics Question
In a single Word document, write a one-page paper that describes real-life applications of the normal distribution.
Essential Activities:
- Reading Chapter 6: The Normal Distribution (Section 6.1 – Section 6.2) will assist you in writing your paper.
Notes:
- This paper must be formatted in APA Style 7th edition.
- Please refer to the written assignment rubric on the course information tab for this paper.
- Similarity not to exceed 20%
Requirements: n/a