
# BUS105 Use of Descriptive Sample Statistics to Investigate Variable: Computing Assignment Answer

__Instructions for the computing assignment worth 20% of your final grade__

Overview

__Materials that must be used in the assignment (these are provided on Moodle)__

- A pair of datasets for country 1 and country 2. Looking at the datasets you will notice that each student has been given a sample. Students must use their own sample.

- An automatic dataset summarizer.

- Instructions for checking that you have properly found your sample. Students must use their own sample.

__ Use the following as the cover page for the word file __

“Title: semester 1, 2020 BUS105 computing assignment”

“Name:”

“Student number:”

“Sample: ”

__Overview __

You need to submit a Word file with the answers to 9 questions. The first 8 are about the dataset; the last question is a paraphrasing task (refer to pages 3 to 6).

You will use your dataset and the automatic dataset summarizer to get the descriptive statistics that are used in questions 1 to 5 and the inferential statistics that are used in questions 6 to 8.

To check that you have correctly obtained your dataset, check that both p-values are correct when you investigate both categorical variables (question 6).

The word count can be less than 1500 words if your answers demonstrate that you have understood the material.

__Summary of the dataset (question 1 to 8 given on pages 3 to 6 are about the dataset) __

Suppose market research company XYZ did a survey in two different countries. The survey was designed to gather basic information about some customers and their opinion about TV model XYZ

The survey questions were

“What is your Income?”

“What is your gender?”

“How much are you willing to spend on a TV?”

“Would you buy TV model XYZ?”

So there are two datasets, one for each country:

One dataset is the survey answers for country 1

One dataset is the survey answers for country 2

Students MUST use the datasets they are given. They CANNOT use datasets they make themselves or take from other sources.

Each of the datasets consists of the following variables,

income? : a quantitative variable

Gender?: a categorical variable

Amount you would spend? : A quantitative variable, the amount they would spend on a TV

Would you buy?: A categorical variable, would they buy TV model XYZ

__paste the following cover page and the answers to questions 1 to 9 below into a word document __

“Title: semester 1, 2020 BUS105 computing assignment”

“Name:”

“Student number:”

“Sample: ”

“(students must use the dataset provided each student has been allocated their own sample)”

__Question 1__

a) Just using the information for Country 1

i) Paste in descriptive sample statistics that let you investigate the relationship between the variables “Gender?” and “Would you buy?” using the sample

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

Difference between sample means -

Difference between sample proportions -

correlation coefficient *r*

b) Just using the information for Country 2

i) Paste in descriptive sample statistics that let you investigate the claim there is a relationship between the variables “Gender?” and “Would you buy?”

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

Difference between sample means -

Difference between sample proportions -

correlation coefficient *r*

c) Compare the results in parts (a) and (b)

__Question 2__

a) Just using the information for country 1

i) Paste in descriptive sample statistics that let you investigate the relationship between the variables “Would you buy?” and “amount you would spend?” using the sample

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

Difference between sample means -

Difference between sample proportions -

correlation coefficient *r*

b) Just using the information for Country 2

i) Paste in descriptive sample statistics that let you investigate the claim there is a relationship between the variables “would you buy?” and “amount you would spend?”

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics

Difference between sample means -

Difference between sample proportions -

correlation coefficient *r*

c) Compare the results in parts (a) and (b)

__Question 3__

a) Just using the information for Country 1

i) Paste in descriptive sample statistics and a graph that let you investigate the relationship between the variables “Income?” and “Amount you would spend?” using the sample

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics

Difference between sample means -

Difference between sample proportions -

correlation coefficient *r*

b) Just using the information for Country 2

i) Paste in descriptive sample statistics and a graph that let you investigate the claim there is a relationship between the variables “Income?” and “Amount you would spend?” using the sample

ii) Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics

Difference between sample means -

Difference between sample proportions -

correlation coefficient *r*

c) Compare the results in parts (a) and (b)

__Question 4__

__Question 5__

Just using the country 1 data set, more specifically the “variables income” and “would they buy” of the country 1 dataset

Hint: this is easy just the dataset summarizer

ii) find a 95% confidence interval for income

Hint: this is easy just the dataset summarizer

ii) find a 95% confidence interval for income

__Question 6__

a) Just using the information for country 1

i) Paste in inferential statistics that measure evidence for the claim there is a relationship between the variables “Gender?” and “Would you buy?” if you consider the whole population

ii) Make suitable comments about the output in part (i)

b) Just using the information for country 2

i) Paste in inferential statistics that measure evidence for the claim there is a relationship between the variables “Gender?” and “Would you buy?” if you consider the whole population

ii) Make suitable comments about the output in part (i)

c) Compare the results in parts (a) and (b)

__Question 7__

a) Just using the information for country 1

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “would you buy?” and “amount you would spend?” if you consider the whole population

Hint: inferential statistics measure evidence for a claim.

ii) Make suitable comments about the output in part (i)

b) Just using the information for country 2

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “would you buy?” and “amount you would spend?” if you consider the whole population

ii) Make suitable comments about the output in part (i)

c) Compare the results in parts (a) and (b)

__Question 8__

a) Just using the information for country 1

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “Income?” and “amount you would spend?” if you consider the whole population

Hint: inferential statistics measure evidence for a claim.

ii) Make suitable comments about the output in part (i)

b) Just using the information for country 2

i) Paste in computer output that measures evidence for the claim there is a relationship between the variables “Income?” and “amount you would spend?” if you consider the whole population

ii) Make suitable comments about the output in part (i)

c) Compare the results in parts (a) and (b)

__Question 9 __

You need to pick ONE of the following options for question 9, Your answer should be about 300 words long

**OPTION 1** Give a brief summary of at least one of the major ideas in the following PowerPoint, available from

app.box.com/s/0eddpa7wut2kdae4nmkynrz0fgfdcrbg

**OPTION 2** Briefly discuss the main message in the following sample report, available from

app.box.com/s/ael5pciel84wnveu3z4bd74ro0v2djgn

__Instructions for the Excel file__

__This is worth 2% of your final grade. You have to use the Excel commands discussed below and not the dataset summarizer. However, you should check that your summaries are the same as the output from the dataset summarizer you used in the Word file. If you have different information you will get at most 1 out of 2.__

You need to cut and paste just your dataset into a new excel file and follow the 4 instructions below, DO NOT use a cover page for the excel file, you must check that you have the correct sample

Note that you can still do this at home even if you do not have excel, just use google sheets

- For country 1: use Excel PivotTable commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “Gender?” and “Would you buy?”
- For country 1: use Excel PivotTable commands (or Google Sheets pivot table commands) to find appropriate sample statistics that let you investigate the relationship between the fields (variables) “Would you buy?” and “Amount you would spend?”
- For country 1: use Excel commands to make a graph that lets you investigate the relationship between the fields (variables) “Income?” and “Amount you would spend?”

Upload the excel file with the pivot tables and scatterplot to the assignment dropbox

## Answer

**Title: BUS105 Computing Assignment**

**Semester 1, 2020 **

__Question 1__

**a) Just using the information for Country 1**

**i)** Paste in descriptive sample statistics that let you investigate the relationship between the variables “Gender?” and “Would you buy?” using the sample

| descriptive sample statistics | N | Y | total |
|---|---|---|---|
| female count | 11 | 35 | 46 |
| female % | 23.91% | 76.09% | 100.00% |
| male count | 16 | 38 | 54 |
| male % | 29.63% | 70.37% | 100.00% |

**ii)** Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics.

The two variables are ‘Gender’ and ‘Would you buy’, and both are categorical. The table above gives the counts and the row percentages for each combination of the two variables. In the sample, 11 females (23.91%) are not willing to buy while 35 females (76.09%) are willing to buy. Similarly, 16 males (29.63%) are not willing to buy while 38 males (70.37%) are willing to buy.

Hence, the majority of both females and males in the sample are willing to buy. Overall, 27 respondents are not willing to buy while 73 respondents are willing to buy. To describe the relationship between gender and willingness to buy, the proportion willing to buy is compared across the two genders:

Difference between sample proportions:

phat1 - phat2 = 0.7609 - 0.7037 = 0.0572

Here phat1 is the sample proportion of females willing to buy and phat2 is the sample proportion of males willing to buy. The proportion willing to buy is about 5.7 percentage points higher for females than for males, which is a small difference.

**b) Just using the information for Country 2**

**i)** Paste in descriptive sample statistics that let you investigate the claim there is a relationship between the variables “Gender?” and “Would you buy?”

| descriptive sample statistics | N | Y | total |
|---|---|---|---|
| female count | 18 | 28 | 46 |
| female % | 39.13% | 60.87% | 100.00% |
| male count | 33 | 21 | 54 |
| male % | 61.11% | 38.89% | 100.00% |

**ii)** Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

The two variables are ‘Gender’ and ‘Would you buy’, and both are categorical. The table above gives the counts and the row percentages for each combination of the two variables. In the sample, 18 females (39.13%) are not willing to buy while 28 females (60.87%) are willing to buy. Similarly, 33 males (61.11%) are not willing to buy while 21 males (38.89%) are willing to buy.

Hence, the majority of females in the sample are willing to buy, but the majority of males are not. Overall, 51 respondents are not willing to buy while 49 respondents are willing to buy. The proportion willing to buy is again compared across the two genders:

Difference between sample proportions:

phat1 - phat2 = 0.6087 - 0.3889 = 0.2198

Here phat1 is the sample proportion of females willing to buy and phat2 is the sample proportion of males willing to buy. The proportion willing to buy is about 22 percentage points higher for females than for males.

**c) Compare the results in parts (a) and (b)**

In both countries the sample proportion willing to buy is higher for females than for males, but the gap is about 5.7 percentage points in Country 1 and about 22 percentage points in Country 2. This indicates a much stronger relationship between gender and willingness to buy in the Country 2 sample. Overall willingness to buy TV model XYZ is higher in Country 1 (73% of respondents, against 49% in Country 2). This is visible in the following stacked bar graphs as well:

__Question 2__

**a) Just using the information for country 1**

**i)** Paste in descriptive sample statistics that let you investigate the relationship between the variables “Would you buy?” and “amount you would spend?” using the sample

descriptive sample statistics

| xbar1 | xbar2 | s1 | s2 | n1 | n2 |
|---|---|---|---|---|---|
| 544.704 | 709.356 | 175.324 | 170 | 27 | 73 |

**ii)** Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics (choose one)

The two variables are ‘Would you buy’ and ‘Amount you would spend’, where the first is categorical and the second is quantitative. The table above gives the mean, standard deviation and count of the amount for each of the two categories: group 1 is those not willing to buy (n1 = 27) and group 2 is those willing to buy (n2 = 73).

Difference between sample means:

xbar1 - xbar2 = 544.704 - 709.356 = -164.65

Hence, in the sample, those willing to buy would spend 164.65 more on average than those not willing to buy.

**b) Just using the information for Country 2**

**i)** Paste in descriptive sample statistics that let you investigate the claim there is a relationship between the variables “would you buy?” and “amount you would spend?”

descriptive sample statistics

| xbar1 | xbar2 | s1 | s2 | n1 | n2 |
|---|---|---|---|---|---|
| 707.569 | 607.898 | 154.455 | 188 | 51 | 49 |

**ii)** Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics

The two variables are ‘Would you buy’ and ‘Amount you would spend’, where the first is categorical and the second is quantitative. The table above gives the mean, standard deviation and count of the amount for each of the two categories: group 1 is those not willing to buy (n1 = 51) and group 2 is those willing to buy (n2 = 49).

Difference between sample means:

xbar1 - xbar2 = 707.569 - 607.898 = 99.67

Hence, in the sample, those not willing to buy would spend 99.67 more on average than those willing to buy.

**c) Compare the results in parts (a) and (b) **

In Country 1, the mean amount that people would spend is 164.65 higher for those willing to buy than for those not willing to buy. In Country 2 the direction is reversed: the mean amount is 99.67 lower for those willing to buy than for those not willing to buy.

__Question 3__

**a) Just using the information for Country 1**

**i)** Paste in descriptive sample statistics and a graph that let you investigate the relationship between the variables “Income?” and “Amount you would spend?” using the sample

| descriptive sample statistics | |
|---|---|
| sample size | 100 |
| sample slope | 9.7647849 |
| sample intercept | 25.794827 |
| sample correlation r | 0.9467401 |

**ii)** Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics

The two variables are ‘Income’ and ‘Amount you would spend’, and both are quantitative. The table above gives the sample correlation and the coefficients of the regression equation, where the sample slope is the coefficient of the x variable and the sample intercept is the intercept value. The correlation r is 0.95, which indicates a strong positive linear relationship between the two variables: as one variable increases, the other also tends to increase.

**b) Just using the information for Country 2**

**i) **Paste in descriptive sample statistics and a graph that let you investigate the claim there is a relationship between the variables “Income?” and “Amount you would spend?” using the sample

| descriptive sample statistics | |
|---|---|
| sample size | 100 |
| sample slope | 7.9331779 |
| sample intercept | 4.7981417 |
| sample correlation r | 0.9846244 |

**ii)** Use the output in part (i) to describe the relationship between the two variables, your discussion must use one of the following sample statistics

The two variables are ‘Income’ and ‘Amount you would spend’, and both are quantitative. The table above gives the sample correlation and the coefficients of the regression equation, where the sample slope is the coefficient of the x variable and the sample intercept is the intercept value. The correlation r is 0.985, which indicates a very strong positive linear relationship between the two variables: as one variable increases, the other also tends to increase.

**c) Compare the results in parts (a) and (b) **

The value of r is positive for both countries, which indicates that the two variables, ‘income’ and ‘amount they would spend’, are positively correlated. The correlation is slightly stronger in Country 2 (r = 0.985) than in Country 1 (r = 0.947). This is seen in the following scatterplots also, where the observations cluster closely around an upward-sloping line.

__Question 4__

**Considering all people in the country 1 dataset**

- What is the sample size *n*: __the sample size is n = 100__
- What is the sample proportion of people that would buy the model XYZ TV: __the sample proportion of people willing to buy the TV is 0.73__
- Use the answers in parts (i) and (ii) to find the z-score of the sample proportion if you assume the population proportion *p* = 0.5:

X, the number of people who would buy the TV, is a binomial variable (each person will either buy or not buy the TV) with:

Mean = µ = np = 100*0.5 = 50

SD = σ = √(np(1-p)) = √(100*0.5*(1-0.5)) = √25 = 5

The sample proportion phat = X/n therefore has mean p = 0.5 and standard deviation σ/n = √(p(1-p)/n) = 5/100 = 0.05

Sample proportion (phat) = 0.73

**z-score = (phat - p)/√(p(1-p)/n)** = (0.73 - 0.50)/0.05 = 4.6

**Considering all people in the country 2 dataset**

- What is the sample size *n*: __the sample size is n = 100__
- What is the sample proportion of people that would buy the model XYZ TV: __the sample proportion of people willing to buy the TV is 0.49__
- Use the answers in parts (i) and (ii) to find the z-score of the sample proportion if you assume the population proportion *p* = 0.5:

X, the number of people who would buy the TV, is a binomial variable (each person will either buy or not buy the TV) with:

Mean = µ = np = 100*0.5 = 50

SD = σ = √(np(1-p)) = √(100*0.5*(1-0.5)) = √25 = 5

The sample proportion phat = X/n therefore has mean p = 0.5 and standard deviation σ/n = √(p(1-p)/n) = 5/100 = 0.05

Sample proportion (phat) = 0.49

**z-score = (phat - p)/√(p(1-p)/n)** = (0.49 - 0.50)/0.05 = -0.2
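As a cross-check on the z-score of a sample proportion, the following is a minimal Python sketch that standardises the sample proportion with the standard error of a proportion, √(p(1-p)/n). The function name `proportion_z_score` is an illustrative choice and not part of the assignment materials; the inputs are the sample proportions, the sample size and the assumed population proportion p = 0.5 given above.

```python
import math


def proportion_z_score(p_hat, p0, n):
    """z-score of a sample proportion p_hat, assuming population proportion p0."""
    # Standard deviation of the sample proportion under the assumed p0.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# Sample proportions willing to buy, taken from the answers above.
for country, p_hat in (("Country 1", 0.73), ("Country 2", 0.49)):
    z = proportion_z_score(p_hat, p0=0.5, n=100)
    print(f"{country}: z = {z:.2f}")
```

The same result follows from working with counts instead of proportions, (X - np)/√(np(1-p)), as long as the numerator and the standard deviation are on the same scale.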

__Question 5__

**Just using the country 1 data set, more specifically the “variables income” and “would they buy” of the country 1 dataset **

**Just considering the people that would buy the TV**

- What is the sample size, sample mean and sample standard deviation of income

Using a pivot table, the information is in the row labelled Y below:

| Row Labels | Count of income | Average of income | StdDev of income |
|---|---|---|---|
| N | 27 | 53.33 | 17.23 |
| Y | 73 | 69.93 | 16.18 |
| Grand Total | 100 | 65.45 | 17.98 |

- Find a 95% confidence interval for income

Confidence interval for income:

Mean = 69.93, n = 73, SD = 16.18

SD/√n = 16.18/√73 = 1.8937, df = n-1 = 72, α = 0.05 (two-tailed), t critical value = 1.9935

CI (95%) = 69.93 ± (1.9935)(1.8937)

= 69.93 ± 3.78

**95% CI = (66.15, 73.71)**

**Just considering the people that would not buy the TV**

- What is the sample size, sample mean and sample standard deviation of income

Using a pivot table, the information is in the row labelled N below:

| Row Labels | Count of income | Average of income | StdDev of income |
|---|---|---|---|
| N | 27 | 53.33 | 17.23 |
| Y | 73 | 69.93 | 16.18 |
| Grand Total | 100 | 65.45 | 17.98 |

- Find a 95% confidence interval for income

Confidence interval for income:

Mean = 53.33, n = 27, SD = 17.23

SD/√n = 17.23/√27 = 3.3159, df = n-1 = 26, α = 0.05 (two-tailed), t critical value = 2.0555

CI (95%) = 53.33 ± (2.0555)(3.3159)

= 53.33 ± 6.82

**95% CI = (46.51, 60.15)**
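The confidence interval arithmetic above can be reproduced with a short Python sketch. It assumes the t critical values quoted in the working (1.9935 for 72 degrees of freedom and 2.0555 for 26) are supplied by hand; the function name `t_confidence_interval` is hypothetical. If SciPy is available, `scipy.stats.t.ppf(0.975, df)` is one way to obtain the critical value instead.

```python
import math


def t_confidence_interval(mean, sd, n, t_critical):
    """Return (lower, upper) for mean +/- t_critical * sd / sqrt(n)."""
    margin = t_critical * sd / math.sqrt(n)
    return mean - margin, mean + margin


# Summary statistics from the pivot table; critical values from the working above.
groups = {
    "would buy (Y)": (69.93, 16.18, 73, 1.9935),
    "would not buy (N)": (53.33, 17.23, 27, 2.0555),
}

for label, (mean, sd, n, t_crit) in groups.items():
    lower, upper = t_confidence_interval(mean, sd, n, t_crit)
    print(f"{label}: 95% CI = ({lower:.2f}, {upper:.2f})")
```

Because the two intervals do not overlap, the sketch also makes it easy to see that mean income differs noticeably between the two groups in this sample.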

__Question 6__

**a) Just using the information for country 1**

**i)** Paste in inferential statistics that measure evidence for the claim there is a relationship between the variables “Gender?” and “Would you buy?” if you consider the whole population

| Inferential statistics | |
|---|---|
| n1 | 46 |
| n2 | 54 |
| phat1 | 0.76087 |
| phat2 | 0.703703704 |
| Estimate of the difference between population proportions, phat1-phat2 | 0.057165862 |
| standard error of estimate | 0.089077397 |
| test stat | 0.641754964 |
| two sided pvalue | 0.521032295 |

To calculate the p-value, H0: p1 = p2 is assumed to be true.

Since the test is two sided, H1 is H1: p1 ≠ p2.

The p-value is 0.52 which is higher than significance level of 0.05. Hence, the p-value is not significant and we do not have enough statistical evidence to reject the null hypothesis.

**ii)** Make suitable comments about the output in part (i)

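At the 0.05 significance level, it is concluded that there is no significant difference between the two population proportions, so the sample does not provide evidence of a relationship between gender and willingness to buy in Country 1.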

**b) Just using the information for country 2**

**i)** Paste in inferential statistics that measure evidence for the claim there is a relationship between the variables “Gender?” and “Would you buy?” if you consider the whole population

| Inferential statistics | |
|---|---|
| n1 | 46 |
| n2 | 54 |
| phat1 | 0.6087 |
| phat2 | 0.388888889 |
| Estimate of the difference between population proportions, phat1-phat2 | 0.219806763 |
| standard error of estimate | 0.100301478 |
| test stat | 2.191460862 |
| two sided pvalue | 0.028418459 |

To calculate the p-value, H0: p1 = p2 is assumed to be true.

Since the test is two sided, H1 is H1: p1 ≠ p2.

The p-value is 0.03 which is lower than significance level of 0.05. Hence, the p-value is significant and we have enough statistical evidence to reject the null hypothesis.

**ii)** Make suitable comments about the output in part (i)

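At the 0.05 significance level, it is concluded that there is a significant difference between the two population proportions, so the sample provides evidence of a relationship between gender and willingness to buy in Country 2.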

**c) Compare the results in parts (a) and (b) **

In case of country 1, no significant difference between proportions was found while in case of Country 2, significant difference between proportions was found.
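The pasted output for both countries can be checked with a two-proportion z-test. The Python sketch below assumes a pooled standard error under H0: p1 = p2 and a normal approximation for the p-value; that choice appears to match the summarizer's figures but is an inference, not something the assignment states. The function name `two_proportion_z_test` is hypothetical, and the counts are those of females and males willing to buy from the Question 1 tables.

```python
import math
from statistics import NormalDist


def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z-test of H0: p1 = p2 using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p1 - p2, standard_error, z, p_value


# (females willing to buy, females, males willing to buy, males)
samples = {"Country 1": (35, 46, 38, 54), "Country 2": (28, 46, 21, 54)}

for country, counts in samples.items():
    diff, se, z, p_value = two_proportion_z_test(*counts)
    print(f"{country}: diff={diff:.4f} se={se:.4f} z={z:.4f} p={p_value:.4f}")
```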

__Question 7__

**a) Just using the information for country 1**

**i)** Paste in computer output that measure evidence for the claim there is a relationship between the variables “would you buy?” and “amount you would spend?” if you consider the whole population

| Inferential statistics | |
|---|---|
| Estimate of the difference between population means, xbar1-xbar2 | -164.652 |
| standard error of estimate xbar1-xbar2 | 39.1474 |
| t test stat | -4.20596 |
| df | 45 |
| two sided pvalue | 0.00012 |

To calculate the p-value, H0: μ1 = μ2 is assumed to be true.

Since the test is two sided, H1 is H1: μ1 ≠ μ2.

**ii)** Make suitable comments about the output in part (i)

The p-value is less than 0.05 so there is strong evidence that there is a difference between population means. We reject the null hypothesis.

We can see that the difference in mean amount between those who are not willing to buy and those who are willing to buy is -164.65 (those willing to buy would spend more), and it is statistically significant as concluded above.

**b) Just using the information for country 2**

**i)** Paste in computer output that measures evidence for the claim there is a relationship between the variables “would you buy?” and “amount you would spend?” if you consider the whole population

| Inferential statistics | |
|---|---|
| Estimate of the difference between population means, xbar1-xbar2 | 99.6707 |
| standard error of estimate xbar1-xbar2 | 34.4636 |
| t test stat | 2.89205 |
| df | 92 |
| two sided pvalue | 0.00478 |

To calculate the p-value, H0: μ1 = μ2 is assumed to be true.

Since the test is two sided, H1 is H1: μ1 ≠ μ2.

**ii)** Make suitable comments about the output in part (i)

The p-value is less than 0.05 so there is strong evidence that there is a difference between population means. We reject the null hypothesis.

We can see that the difference in mean amount between those who are not willing to buy and those who are willing to buy is 99.67 (those not willing to buy would spend more), and it is statistically significant as concluded above.

**c) Compare the results in parts (a) and (b) **

Both countries show statistically significant evidence of a difference in the mean amount that people would spend, depending on whether or not they are willing to buy. In Country 1 the average is higher by 164.65 for those willing to buy compared with those not willing to buy. In Country 2 the average is lower by 99.67 for those willing to buy compared with those not willing to buy.
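The t statistics and degrees of freedom in the pasted output can be approximately reproduced from the Question 2 summary statistics. The Python sketch below assumes an unequal-variance (Welch) two-sample t-test with the Welch-Satterthwaite degrees of freedom rounded down; this is an inference from the reported df values of 45 and 92 rather than a documented property of the dataset summarizer. The function name `welch_t_from_summary` is hypothetical. Because the inputs are rounded (s2 is shown only as 170 and 188), the results agree with the output to about two decimal places only.

```python
import math


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch t statistic and Welch-Satterthwaite df from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second mean
    standard_error = math.sqrt(v1 + v2)
    t_stat = (mean1 - mean2) / standard_error
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t_stat, df


# (xbar1, s1, n1, xbar2, s2, n2): group 1 = would not buy, group 2 = would buy.
summaries = {
    "Country 1": (544.704, 175.324, 27, 709.356, 170, 73),
    "Country 2": (707.569, 154.455, 51, 607.898, 188, 49),
}

for country, stats in summaries.items():
    t_stat, df = welch_t_from_summary(*stats)
    print(f"{country}: t = {t_stat:.3f}, df = {math.floor(df)}")
```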

__Question 8__

**a) Just using the information for country 1**

**i)** Paste in computer output that measures evidence for the claim there is a relationship between the variables “Income?” and “amount you would spend?” if you consider the whole population

Inferential statistics (paste this into the word file and add comments)

| | |
|---|---|
| correlation r | 0.9467401 |
| R square | 0.8963169 |
| standard error of slope | 0.3354848 |
| test stat of slope | 29.106492 |
| two sided p-value for slope | 4.991E-50 |

To calculate the p-value, H0: population slope = 0 is assumed to be true.

Since the test is two sided, H1 is H1: population slope ≠ 0.

**ii)** Make suitable comments about the output in part (i)

The p-value is less than 0.05 so there is strong evidence that there is relationship between the two variables as also indicated by high value of R^{2} of 0.896.

**b) Just using the information for country 2**

**i)** Paste in computer output that measures evidence for the claim there is a relationship between the variables “Income?” and “amount you would spend?” if you consider the whole population

Inferential statistics (paste this into the word file and add comments)

| | |
|---|---|
| correlation r | 0.9846244 |
| R square | 0.9694852 |
| standard error of slope | 0.1421735 |
| test stat of slope | 55.799267 |
| two sided p-value for slope | 4.496E-76 |

To calculate the p-value, H0: population slope = 0 is assumed to be true.

Since the test is two sided, H1 is H1: population slope ≠ 0.

**ii)** Make suitable comments about the output in part (i)

The p-value is less than 0.05 so there is strong evidence that there is relationship between the two variables as also indicated by high value of R^{2} of 0.969.

**c) Compare the results in parts (a) and (b) **

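Both countries show a very strong positive relationship between income and the amount people are willing to spend, as indicated by the high values of r and R^{2} and by the very small p-values for the slope. The relationship is somewhat stronger in Country 2 (R^{2} = 0.969) than in Country 1 (R^{2} = 0.896).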
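The regression output for the two countries is internally consistent, which can be verified with a short Python sketch. For simple linear regression, R square equals r squared, and the test statistic of the slope (slope divided by its standard error) equals r√(n-2)/√(1-r²). The function name `slope_t_from_r` is hypothetical; the inputs are the values reported in Questions 3 and 8.

```python
import math


def slope_t_from_r(r, n):
    """t statistic for H0: population slope = 0, computed from r and n."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# (r, slope, standard error of slope) as reported for each country; n = 100.
reported = {
    "Country 1": (0.9467401, 9.7647849, 0.3354848),
    "Country 2": (0.9846244, 7.9331779, 0.1421735),
}

for country, (r, slope, se_slope) in reported.items():
    print(
        f"{country}: R^2 = {r ** 2:.4f}, "
        f"t from r = {slope_t_from_r(r, 100):.3f}, "
        f"slope / SE = {slope / se_slope:.3f}"
    )
```

Both routes to the test statistic should agree up to rounding, which is a quick way to confirm that the correct columns were selected in the dataset summarizer.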

__Question 9 __

OPTION 2

The given report is similar in nature to the one that we are currently doing. The report discusses the case of two universities where statistics is being taught using an old method and a new method.

The variables analysed are the student's attendance, the student's mark, whether they passed or failed the course, and the teaching method used:

- Attendance, which refers to the number of classes attended, is a quantitative variable
- Mark, which refers to the student's mark, is a quantitative variable
- Did they pass refers to the result and is a categorical variable with output being either pass or fail.
- Which method refers to the method of teaching and is a categorical variable with output being either old or new.

The report analyses descriptive sample statistics and inferential statistics between various variables.

- The analysis for two categorical variables, “which method” and “did they pass” uses difference in sample proportions method. P-value method is used under inferential technique.
- The analysis for two quantitative variables, “attendance” and “marks” uses scatterplot, correlation and regression techniques. P-value method is used under inferential technique.
- The analysis for one categorical variable, “which method” and one quantitative variable, “attendance” uses difference in sample means method.

The report also compares results for University 1 and University 2 to conclude the inferences based on use of various statistical methods.

Hence, the report is similar in using various techniques and explains how various types of variables can be analysed. The same technique cannot be used for all types of variables and hence, the method varies.
