My Assignment Help

BUS105 Use of Descriptive Sample Statistics to Investigate Variable: Computing Assignment Answer

Instructions for the computing assignment (worth 20% of your final grade)

Overview 

Materials that must be used in the assignment (these are provided on Moodle):

*A pair of datasets for country 1 and country 2. Looking at the datasets you will notice that each student has been given a sample; students must use their own sample.
*An automatic dataset summarizer.
*Instructions for checking that you have properly found your sample; students must use their own sample.

 Use the following as the cover page for the word file  

“Title: semester 1, 2020 BUS105 computing assignment”
“Name:”
“Student number:”
“Sample:   ”

Overview 

You need to submit a Word file with the answers to 9 questions. The first 8 are about the dataset; the last question is a paraphrasing task (refer to pages 3 to 6).

You will use your dataset and the automatic dataset summarizer to get the descriptive statistics that are used in questions 1 to 5 and the inferential statistics that are used in questions 6 to 8.
To check that you have correctly obtained your dataset, check that both p-values are correct when you investigate both categorical variables (question 6).

 The word count can be less than 1500 words if you are giving answers that demonstrate you have understood the material.

Summary of the dataset (question 1 to 8 given on pages 3 to 6  are about the dataset) 

Suppose market research company XYZ did a survey in two different countries. The survey was designed to gather basic information about some customers and their opinion about TV model XYZ

The survey questions were

“What is your Income?”
“What is your gender?”
“How much are you willing to spend on a TV?”
“Would you buy TV model XYZ?” 

So there are two datasets, one for each country:

One dataset is the survey answers for country 1
One dataset is the survey answers for country 2

Students MUST use the datasets they are given. They CANNOT use datasets they make themselves or take from other sources.

Each of the datasets consists of the following variables, 

Income?: a quantitative variable
Gender?: a categorical variable
Amount you would spend?: a quantitative variable, the amount they would spend on a TV
Would you buy?: a categorical variable, would they buy TV model XYZ

paste the following cover page and the answers to questions 1 to 9 below into a word document 

“Title: semester 1, 2020 BUS105 computing assignment”
“Name:”
“Student number:”
“Sample:   ”

“(students must use the dataset provided each student has been allocated their own sample)”

Question 1
a) Just using the information for Country 1

i) Paste in descriptive sample statistics that let you investigate the relationship between the variables “Gender?” and “Would you buy?” using the sample 

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

Difference between sample means -
Difference between sample proportions  -
correlation coefficient r

b) Just using the information for Country 2

i) Paste in descriptive sample statistics that let you investigate the claim there is a relationship between the variables “Gender?” and “Would you buy?”

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

Difference between sample means -
Difference between sample proportions  -
correlation coefficient r

c) Compare the results in parts (a) and (b)

Question 2

a) Just using the information for country 1

i) Paste in descriptive sample statistics that let you investigate the relationship between the variables “Would you buy?” and “amount you would spend?” using the sample

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

Difference between sample means -
Difference between sample proportions  -
correlation coefficient r

b) Just using the information for Country 2

i) Paste in descriptive sample statistics that let you investigate the claim there is a relationship between the variables “would you buy?” and “amount you would spend?”

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics 

Difference between sample means -
Difference between sample proportions  -
correlation coefficient r

c) Compare the results in parts (a) and (b)

Question 3

a) Just using the information for Country 1

i) Paste in descriptive sample statistics and a graph that let you investigate the relationship between the variables “Income?” and “Amount you would spend?” using the sample 

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics 

Difference between sample means -
Difference between sample proportions  -
correlation coefficient r

b) Just using the information for Country 2

i) Paste in descriptive sample statistics and a graph that let you investigate the claim there is a relationship between the variables “Income?” and “Amount you would spend?” using the sample

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics 

Difference between sample means -
Difference between sample proportions  -
correlation coefficient r

c) Compare the results in parts (a) and (b) 

Question 4

Question 5

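Just using the country 1 data set, more specifically the variables “income” and “would they buy” of the country 1 dataset

a) Just considering the people that would buy the TV

i) What is sample size, sample mean and sample standard deviation of income

Hint: this is easy, just use the dataset summarizer

ii) Find a 95% confidence interval for income

b) Just considering the people that would not buy the TV

i) What is sample size, sample mean and sample standard deviation of income

Hint: this is easy, just use the dataset summarizer

ii) Find a 95% confidence interval for income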

Question 6

a) Just using the information for country 1

i) Paste in inferential statistics that measure evidence for the claim there is a relationship between the variables “Gender?” and “Would you buy?” if you consider the whole population 

ii) Make suitable comments about the output in part (i)

b) Just using the information for country 2

i) Paste in inferential statistics that measure evidence for the claim there is a relationship between the variables “Gender?” and “Would you buy?” if you consider the whole population 

ii) Make suitable comments about the output in part (i)

c) Compare the results in parts (a) and (b)

Question 7

a) Just using the information for country 1

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “would you buy?” and “amount you would spend?” if you consider the whole population

Hint: inferential statistics measure evidence for a claim.

ii) Make suitable comments about the output in part (i)

b) Just using the information for country 2

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “would you buy?” and “amount you would spend?” if you consider the whole population 

ii) Make suitable comments about the output in part (i)

c) Compare the results in parts (a) and (b)

Question 8

a) Just using the information for country 1

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “Income?” and “amount you would spend?” if you consider the whole population
Hint: inferential statistics measure evidence for a claim.

ii) Make suitable comments about the output in part (i)

b) Just using the information for country 2

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “Income?” and “amount you would spend?” if you consider the whole population 

ii) Make suitable comments about the output in part (i)

c) Compare the results in parts (a) and (b)

Question 9 

You need to pick ONE of the following options for question 9. Your answer should be about 300 words long.

OPTION 1
Give a brief summary of at least one of the major ideas in the following PowerPoint, available from

app.box.com/s/0eddpa7wut2kdae4nmkynrz0fgfdcrbg 

OPTION 2

Briefly discuss the main message in the following sample report, available from

app.box.com/s/ael5pciel84wnveu3z4bd74ro0v2djgn 

Instructions for the Excel file

This is worth 2% of your final grade.
You have to use the Excel commands discussed below and not the dataset summarizer.
However, you should check that your summaries are the same as the output from the dataset summarizer you used in the Word file.
If you have different information you will get at most 1 out of 2.

You need to cut and paste just your dataset into a new Excel file and follow the 4 instructions below. DO NOT use a cover page for the Excel file. You must check that you have the correct sample.

Note that you can still do this at home even if you do not have Excel; just use Google Sheets.

  1. For country 1
    Use Excel PivotTable commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “Gender?” and “Would you buy?”
  2. For country 1
    Use Excel PivotTable commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “Would you buy?” and “Amount you would spend?”
  3. For country 1
    Use Excel commands to make a graph that lets you investigate the relationship between the fields (variables) “Income?” and “Amount you would spend?”

Upload the excel file with the pivot tables and scatterplot to the assignment dropbox 

Answer

Title: BUS105 Computing Assignment

Semester 1, 2020 

Question 1

a) Just using the information for Country 1

i) Paste in descriptive sample statistics that let you investigate the relationship between the variables “Gender?” and “Would you buy?” using the sample 

descriptive sample statistics
                     N         Y     total
female count        11        35        46
female %        23.91%    76.09%   100.00%
male count          16        38        54
male %          29.63%    70.37%   100.00%

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics.

The two variables, ‘Gender’ and ‘Would you buy’, are both categorical. The table above gives the counts and row percentages for each combination of the two variables. Of the 46 female respondents, 11 (23.91%) are not willing to buy while 35 (76.09%) are willing to buy. Of the 54 male respondents, 16 (29.63%) are not willing to buy while 38 (70.37%) are willing to buy.

Hence, a majority of respondents, male or female, is willing to buy. Overall, 73 respondents are willing to buy while 27 are not, a margin of 0.73 - 0.27 = 0.46 in favour of buying.

Difference between sample proportions (proportion willing to buy, phat1 for female and phat2 for male respondents):

phat1 - phat2 = 0.7609 - 0.7037 = 0.0572

The proportion willing to buy is about 5.7 percentage points higher among female respondents than among male respondents, so in this sample the relationship between gender and willingness to buy is weak.

b) Just using the information for Country 2

i) Paste in descriptive sample statistics that let you investigate the claim there is a relationship between the variables “Gender?” and “Would you buy?”

descriptive sample statistics
                     N         Y     total
female count        18        28        46
female %        39.13%    60.87%   100.00%
male count          33        21        54
male %          61.11%    38.89%   100.00%

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

The two variables, ‘Gender’ and ‘Would you buy’, are both categorical. The table above gives the counts and row percentages for each combination of the two variables. Of the 46 female respondents, 18 (39.13%) are not willing to buy while 28 (60.87%) are willing to buy. Of the 54 male respondents, 33 (61.11%) are not willing to buy while 21 (38.89%) are willing to buy.

Hence, a majority of female respondents is willing to buy, but a majority of male respondents is not. Overall, 49 respondents are willing to buy while 51 are not, a margin of 0.49 - 0.51 = -0.02, that is, slightly against buying.

Difference between sample proportions (proportion willing to buy, phat1 for female and phat2 for male respondents):

phat1 - phat2 = 0.6087 - 0.3889 = 0.2198

The proportion willing to buy is about 22.0 percentage points higher among female respondents than among male respondents, so in this sample there is a noticeable relationship between gender and willingness to buy.

c) Compare the results in parts (a) and (b)  

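Overall willingness to buy TV model XYZ is much stronger in Country 1, where 73% of respondents would buy, than in Country 2, where 49% would. The relationship with gender is the other way round: the proportion willing to buy is only about 5.7 percentage points higher for female than for male respondents in Country 1, but about 22.0 percentage points higher in Country 2. This is visible in the following stacked bar graphs as well: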


[Figures: stacked bar graphs of the descriptive sample statistics for Country 1 and Country 2]
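The difference between sample proportions for the two genders can be recomputed directly from the counts in the two tables above. The following is only a minimal sketch; the dictionary layout and function names are illustrative assumptions, not part of the dataset summarizer.

```python
# Counts copied from the two descriptive tables: (would not buy, would buy).
counts = {
    "Country 1": {"female": (11, 35), "male": (16, 38)},
    "Country 2": {"female": (18, 28), "male": (33, 21)},
}


def proportion_yes(no, yes):
    """Sample proportion answering Y within one gender group."""
    return yes / (no + yes)


def gender_gap(table):
    """Female minus male sample proportion of Y answers."""
    return proportion_yes(*table["female"]) - proportion_yes(*table["male"])


gaps = {country: gender_gap(table) for country, table in counts.items()}
for country, gap in gaps.items():
    print(f"{country}: phat_female - phat_male = {gap:.4f}")
```

A gap close to zero suggests little relationship between the two categorical variables in the sample; a larger gap suggests a stronger one.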

Question 2

a) Just using the information for country 1

i) Paste in descriptive sample statistics that let you investigate the relationship between the variables “Would you buy?” and “amount you would spend?” using the sample

descriptive sample statistics
xbar1      xbar2      s1         s2     n1    n2
544.704    709.356    175.324    170    27    73

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

The two variables are ‘Would you buy’, which is categorical, and ‘Amount you would spend’, which is quantitative. The table above gives the sample mean, standard deviation and count for the two categories: group 1 is those not willing to buy (n1 = 27) and group 2 is those willing to buy (n2 = 73).

Difference between sample means:

xbar1 - xbar2 = 544.704 - 709.356 = -164.65

Hence, those willing to buy would spend 164.65 more on average than those not willing to buy.

b) Just using the information for Country 2

i) Paste in descriptive sample statistics that let you investigate the claim there is a relationship between the variables “would you buy?” and “amount you would spend?”

descriptive sample statistics
xbar1      xbar2      s1         s2     n1    n2
707.569    607.898    154.455    188    51    49

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics 

The two variables are ‘Would you buy’, which is categorical, and ‘Amount you would spend’, which is quantitative. The table above gives the sample mean, standard deviation and count for the two categories: group 1 is those not willing to buy (n1 = 51) and group 2 is those willing to buy (n2 = 49).

Difference between sample means:

xbar1 - xbar2 = 707.569 - 607.898 = 99.67

Hence, those not willing to buy would spend 99.67 more on average than those willing to buy.

c) Compare the results in parts (a) and (b)  

In Country 1, those willing to buy would spend 164.65 more on average than those not willing to buy. In Country 2 the direction is reversed: those willing to buy would spend 99.67 less on average than those not willing to buy.

Question 3

a) Just using the information for Country 1

i) Paste in descriptive sample statistics and a graph that let you investigate the relationship between the variables “Income?” and “Amount you would spend?” using the sample 

descriptive sample statistics
sample size             100
sample slope            9.7647849
sample intercept        25.794827
sample correlation r    0.9467401

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics 

The two variables, ‘Income’ and ‘Amount you would spend’, are both quantitative. The table above gives the sample correlation and the coefficients of the regression equation, where the sample slope is the coefficient of the x variable and the sample intercept is the intercept value. The sample correlation r is 0.95, which indicates a strong positive linear relationship between the two variables: as one variable increases, the other tends to increase as well.

b) Just using the information for Country 2

i) Paste in descriptive sample statistics and a graph that let you investigate the claim there is a relationship between the variables “Income?” and “Amount you would spend?” using the sample

descriptive sample statistics
sample size             100
sample slope            7.9331779
sample intercept        4.7981417
sample correlation r    0.9846244

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics 

The two variables, ‘Income’ and ‘Amount you would spend’, are both quantitative. The table above gives the sample correlation and the coefficients of the regression equation, where the sample slope is the coefficient of the x variable and the sample intercept is the intercept value. The sample correlation r is 0.985, which indicates a very strong positive linear relationship between the two variables: as one variable increases, the other tends to increase as well.

c) Compare the results in parts (a) and (b) 

The value of r is positive for both countries, indicating that the two variables, ‘income’ and ‘amount they would spend’, are positively correlated: as the value of one variable increases, the value of the other tends to increase as well. The correlation is slightly stronger in Country 2 (r = 0.985) than in Country 1 (r = 0.947). This is also seen in the following scatterplots, where the observations cluster closely around an upward-sloping line.

[Figures: scatterplots of the descriptive sample statistics for Country 1 and Country 2]
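The sample slope and intercept reported for each country define a fitted line that can be used to read off a typical amount for a given income. The sketch below assumes that income is the x variable and the amount they would spend is the y variable; the income value of 60 is a hypothetical input chosen only for illustration, in whatever units the dataset uses.

```python
# (intercept, slope) pairs copied from the descriptive tables in Question 3.
fits = {
    "Country 1": (25.794827, 9.7647849),
    "Country 2": (4.7981417, 7.9331779),
}


def predicted_amount(country, income):
    """Fitted amount = intercept + slope * income (income assumed to be x)."""
    intercept, slope = fits[country]
    return intercept + slope * income


# Hypothetical income value, used only to illustrate the fitted lines.
example_income = 60
for country in fits:
    amount = predicted_amount(country, example_income)
    print(f"{country}: fitted amount at income {example_income} = {amount:.2f}")
```

Because both slopes are positive, the fitted amount rises with income in both countries, which is the same message as the positive values of r.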


Question 4

a) Considering all people in the country 1 dataset

i) What is sample size n: the sample size is n = 100

ii) What is the sample proportion of people that would buy the model XYZ TV: the sample proportion of people willing to buy the TV is 0.73

iii) Use the answers in part (i) and (ii) to find the z-score of the sample proportion if you assume the population proportion p = 0.5:

X, the number of people who would buy, is a binomial variable (people will either buy or not buy the TV) with:

Mean = µ = np = 100*0.5 = 50

SD = σ = √(np(1-p)) = √(100*0.5*(1-0.5)) = √25 = 5

Sample proportion (phat) = X/n = 0.73, with standard deviation σ/n = √(p(1-p)/n) = 5/100 = 0.05

z-score = (phat - p)/(σ/n) = (0.73 - 0.50)/0.05 = 4.6

b) Considering all people in the country 2 dataset

i) What is sample size n: the sample size is n = 100

ii) What is the sample proportion of people that would buy the model XYZ TV: the sample proportion of people willing to buy the TV is 0.49

iii) Use the answers to parts (i) and (ii) to find the z-score of the sample proportion if you assume the population proportion p = 0.5:

X, the number of people who would buy, is a binomial variable (people will either buy or not buy the TV) with:

Mean = µ = np = 100*0.5 = 50

SD = σ = √(np(1-p)) = √(100*0.5*(1-0.5)) = √25 = 5

Sample proportion (phat) = X/n = 0.49, with standard deviation σ/n = √(p(1-p)/n) = 5/100 = 0.05

z-score = (phat - p)/(σ/n) = (0.49 - 0.50)/0.05 = -0.2
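As a check on the z-scores in Question 4, the sketch below standardises each sample proportion using the standard deviation of a sample proportion, √(p(1-p)/n), which is 0.05 for n = 100 and p = 0.5. This is equivalent to comparing the count of buyers with µ = 50 using σ = 5. The function name is an illustrative assumption.

```python
import math


def proportion_z_score(p_hat, p0, n):
    """z-score of a sample proportion when the population proportion is p0."""
    standard_deviation = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_deviation


z_country1 = proportion_z_score(0.73, 0.5, 100)
z_country2 = proportion_z_score(0.49, 0.5, 100)

print(f"Country 1 z-score: {z_country1:.2f}")
print(f"Country 2 z-score: {z_country2:.2f}")
```

A z-score far from zero, as for Country 1, means the sample proportion would be very unusual if the population proportion really were 0.5; a z-score near zero, as for Country 2, means it would not.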

Question 5

Just using the country 1 data set, more specifically the variables “income” and “would they buy” of the country 1 dataset

a) Just considering the people that would buy the TV

i) What is sample size, sample mean and sample standard deviation of income

Using a pivot table, the information is in the Y row below:

Row Labels     Count of income    Average of income2    StdDev of income3
N              27                 53.33                 17.23
Y              73                 69.93                 16.18
Grand Total    100                65.45                 17.98

ii) Find a 95% confidence interval for income

Confidence interval for income = mean ± t × SD/√n

Mean = 69.93, n = 73, SD = 16.18

SD/√n = 16.18/√73 = 1.8937, df = n-1 = 72, α = 0.05 (two-tailed), t = 1.9935

CI (95%) = 69.93 ± (1.9935)(1.8937)

= 69.93 ± 3.78

95% CI = 66.15 to 73.71

b) Just considering the people that would not buy the TV

i) What is sample size, sample mean and sample standard deviation of income

Using a pivot table, the information is in the N row below:

Row Labels     Count of income    Average of income2    StdDev of income3
N              27                 53.33                 17.23
Y              73                 69.93                 16.18
Grand Total    100                65.45                 17.98

ii) Find a 95% confidence interval for income

Confidence interval for income = mean ± t × SD/√n

Mean = 53.33, n = 27, SD = 17.23

SD/√n = 17.23/√27 = 3.3159, df = n-1 = 26, α = 0.05 (two-tailed), t = 2.0555

CI (95%) = 53.33 ± (2.0555)(3.3159)

= 53.33 ± 6.82

95% CI = 46.51 to 60.15
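The two confidence intervals in Question 5 follow the same recipe, mean ± t × SD/√n, so they can be checked with one small function. This is only a sketch: the critical values 1.9935 (72 degrees of freedom) and 2.0555 (26 degrees of freedom) are taken from the worked answer rather than computed, and the function name is an illustrative assumption.

```python
import math


def t_confidence_interval(mean, sd, n, t_critical):
    """Return (lower, upper) for mean +/- t_critical * sd / sqrt(n)."""
    margin = t_critical * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Summary statistics from the pivot table; t values as quoted in the answer.
ci_would_buy = t_confidence_interval(69.93, 16.18, 73, 1.9935)
ci_would_not_buy = t_confidence_interval(53.33, 17.23, 27, 2.0555)

print("Would buy:     %.2f to %.2f" % ci_would_buy)
print("Would not buy: %.2f to %.2f" % ci_would_not_buy)
```

The two intervals do not overlap, which is consistent with people who would buy the TV having a higher mean income than people who would not.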

Question 6

a) Just using the information for country 1

i) Paste in inferential statistics that measure evidence for the claim there is a relationship between the variables “Gender?” and “Would you buy?” if you consider the whole population 

Inferential statistics
n1    n2    phat 1     phat 2
46    54    0.76087    0.703703704

Estimate of the difference between population proportions
phat1-phat2
0.057165862

standard error of estimate    test stat      two sided pvalue
0.089077397                   0.641754964    0.521032295

To calculate the p-value H0:p1=p2 is assumed to be true
since the test is two sided H1 is H1:p1≠p2

The p-value is 0.52, which is higher than the significance level of 0.05. Hence, the result is not significant and we do not have enough statistical evidence to reject the null hypothesis.

ii) Make suitable comments about the output in part (i)

At the 0.05 significance level, it is concluded that there is no significant difference between the two population proportions, so there is no evidence of a relationship between gender and willingness to buy in the Country 1 population.

b) Just using the information for country 2

i) Paste in inferential statistics that measure evidence for the claim there is a relationship between the variables “Gender?” and “Would you buy?” if you consider the whole population 

Inferential statistics
n1    n2    phat 1    phat 2
46    54    0.6087    0.388888889

Estimate of the difference between population proportions
phat1-phat2
0.219806763

standard error of estimate    test stat      two sided pvalue
0.100301478                   2.191460862    0.028418459

To calculate the p-value H0:p1=p2 is assumed to be true
since the test is two sided H1 is H1:p1≠p2

The p-value is 0.03, which is lower than the significance level of 0.05. Hence, the result is significant and we have enough statistical evidence to reject the null hypothesis.

ii) Make suitable comments about the output in part (i)

At the 0.05 significance level, it is concluded that there is a significant difference between the two population proportions, so there is evidence of a relationship between gender and willingness to buy in the Country 2 population.

c) Compare the results in parts (a) and (b)  

In Country 1, no significant difference between the proportions was found (p-value 0.52), while in Country 2 a significant difference between the proportions was found (p-value 0.03).
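The standard errors, test statistics and p-values in Question 6 are consistent with a two-sided z-test that pools the two samples under H0:p1=p2. The sketch below reproduces them from the counts using only the Python standard library; that the summarizer works this way is an assumption inferred from the matching numbers, and the function name is illustrative.

```python
import math


def two_proportion_z_test(yes1, n1, yes2, n2):
    """Pooled two-sided z-test for H0: p1 = p2.

    Returns (difference, standard error, z statistic, two-sided p-value).
    """
    p1, p2 = yes1 / n1, yes2 / n2
    pooled = (yes1 + yes2) / (n1 + n2)  # common proportion assumed under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * P(Z > |z|)
    return p1 - p2, se, z, p_value


# Group 1 = female, group 2 = male; "yes" = would buy.
country1 = two_proportion_z_test(35, 46, 38, 54)
country2 = two_proportion_z_test(28, 46, 21, 54)

for name, result in (("Country 1", country1), ("Country 2", country2)):
    diff, se, z, p_value = result
    print(f"{name}: diff={diff:.4f} se={se:.4f} z={z:.4f} p={p_value:.4f}")
```

Using `math.erfc` avoids needing a statistics package for the normal tail probability.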

Question 7

a) Just using the information for country 1

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “would you buy?” and “amount you would spend?” if you consider the whole population

Inferential statistics
Estimate of the difference between population means
xbar1-xbar2
-164.652

standard error of estimate xbar1-xbar2
39.1474

t test stat    df    two sided pvalue
-4.20596       45    0.00012

To calculate the p-value H0:μ1=μ2 is assumed to be true
since the test is two sided H1 is H1:μ1≠μ2

ii) Make suitable comments about the output in part (i)

The p-value (0.00012) is less than 0.05, so there is strong evidence of a difference between the population means. We reject the null hypothesis.

The difference in mean amount between those not willing to buy and those willing to buy is -164.65, and it is statistically significant as concluded above.

b) Just using the information for country 2

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “would you buy?” and “amount you would spend?” if you consider the whole population 

Inferential statistics
Estimate of the difference between population means
xbar1-xbar2
99.6707

standard error of estimate xbar1-xbar2
34.4636

t test stat    df    two sided pvalue
2.89205        92    0.00478

To calculate the p-value H0:μ1=μ2 is assumed to be true
since the test is two sided H1 is H1:μ1≠μ2

ii) Make suitable comments about the output in part (i)

The p-value (0.00478) is less than 0.05, so there is strong evidence of a difference between the population means. We reject the null hypothesis.

The difference in mean amount between those not willing to buy and those willing to buy is 99.67, and it is statistically significant as concluded above.

c) Compare the results in parts (a) and (b)  

Both countries show statistically significant evidence of a difference in the mean amount that people are willing to spend between those willing to buy and those not willing to buy. In Country 1 the average is higher by 164.65 for those willing to buy compared with those not willing to buy. In Country 2 the average is lower by 99.67 for those willing to buy compared with those not willing to buy.
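The standard errors and degrees of freedom in Question 7 look like those of an unpooled (Welch) two-sample t-test, with the degrees of freedom rounded down. The sketch below recomputes them from the summary statistics in Question 2 under that assumption. The second standard deviations (170 and 188) appear to be rounded in the pasted output, so the results match the summarizer only approximately; the function name is illustrative.

```python
import math


def welch_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Unpooled standard error, t statistic and Welch-Satterthwaite df."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2
    se = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return se, t_stat, df


# Group 1 = would not buy, group 2 = would buy (summary values as printed).
se1, t1, df1 = welch_summary(544.704, 175.324, 27, 709.356, 170, 73)
se2, t2, df2 = welch_summary(707.569, 154.455, 51, 607.898, 188, 49)

print(f"Country 1: se={se1:.2f} t={t1:.2f} df={math.floor(df1)}")
print(f"Country 2: se={se2:.2f} t={t2:.2f} df={math.floor(df2)}")
```

The opposite signs of the two t statistics reflect the opposite directions of the difference in the two countries.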

Question 8

a) Just using the information for country 1

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “Income?” and “amount you would spend?” if you consider the whole population 

Inferential statistics
(paste this into the word file and add comments)
correlation r                    0.9467401
R square                         0.8963169
standard error of slope          0.3354848
test stat of slope               29.106492
two sided p-value for slope      4.991E-50

To calculate the p-value H0:population slope =0 is assumed to be true
since the test is two sided H1 is H1:population slope ≠0

ii) Make suitable comments about the output in part (i)

The p-value is far less than 0.05, so there is strong evidence of a relationship between the two variables, which is also indicated by the high R² value of 0.896.

b) Just using the information for country 2

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “Income?” and “amount you would spend?” if you consider the whole population 

Inferential statistics
(paste this into the word file and add comments)
correlation r                    0.9846244
R square                         0.9694852
standard error of slope          0.1421735
test stat of slope               55.799267
two sided p-value for slope      4.496E-76

To calculate the p-value H0:population slope =0 is assumed to be true
since the test is two sided H1 is H1:population slope ≠0

ii) Make suitable comments about the output in part (i)

The p-value is far less than 0.05, so there is strong evidence of a relationship between the two variables, which is also indicated by the high R² value of 0.969.

c) Compare the results in parts (a) and (b)  

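Both countries show a very strong positive relationship between income and the amount people are willing to spend, as indicated by the high values of r and R². The relationship is slightly stronger in Country 2 (r = 0.985, R² = 0.969) than in Country 1 (r = 0.947, R² = 0.896).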
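The regression output in Question 8 can be cross-checked in two ways: the test statistic of the slope is the slope divided by its standard error, and in simple linear regression the same statistic equals r√(n-2)/√(1-r²). The sketch below applies both to the values reported above and in Question 3. It is only a consistency check with illustrative function names; the p-values are not recomputed because that would need a t-distribution routine outside the standard library.

```python
import math


def slope_t_from_estimate(slope, standard_error):
    """Test statistic for H0: population slope = 0."""
    return slope / standard_error


def slope_t_from_r(r, n):
    """Same statistic written in terms of the sample correlation r."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


n = 100  # sample size in both countries

# Country 1: slope, standard error of slope and r as reported.
t1_estimate = slope_t_from_estimate(9.7647849, 0.3354848)
t1_r = slope_t_from_r(0.9467401, n)
r_squared_1 = 0.9467401 ** 2

# Country 2: slope, standard error of slope and r as reported.
t2_estimate = slope_t_from_estimate(7.9331779, 0.1421735)
t2_r = slope_t_from_r(0.9846244, n)
r_squared_2 = 0.9846244 ** 2

print(f"Country 1: t={t1_estimate:.3f} (from r: {t1_r:.3f}), R^2={r_squared_1:.4f}")
print(f"Country 2: t={t2_estimate:.3f} (from r: {t2_r:.3f}), R^2={r_squared_2:.4f}")
```

The two routes agreeing for each country confirms that the correlation, slope and standard error in the pasted output belong to the same fitted line.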

Question 9 

OPTION 2

The given report is similar in nature to the one that we are currently doing. The report discusses case of two universities where statistics is being taught using an old method and a new method. 

The variables analysed include attendance of the student, marks of the student and whether they passed or failed the course. Hence, 

  1. Attendance, which refers to the number of classes attended, is a quantitative variable.
  2. Mark, which refers to the student's mark, is a quantitative variable.
  3. Did they pass refers to the result and is a categorical variable with output being either pass or fail.
  4. Which method refers to the method of teaching and is a categorical variable with output being either old or new.

The report analyses descriptive sample statistics and inferential statistics between various variables.

  1. The analysis for two categorical variables, “which method” and “did they pass” uses difference in sample proportions method. P-value method is used under inferential technique.
  2. The analysis for two quantitative variables, “attendance” and “marks” uses scatterplot, correlation and regression techniques. P-value method is used under inferential technique.
  3. The analysis for one categorical variable, “which method” and one quantitative variable, “attendance” uses difference in sample means method.

The report also compares results for University 1 and University 2 to conclude the inferences based on use of various statistical methods.

Hence, the report is similar in using various techniques and explains how various types of variables can be analysed. The same technique cannot be used for all types of variables and hence, the method varies.
