

Prepare a report in a document file (.doc or .docx) which includes all relevant tables and figures, using the following structure:

1. Section 1: Introduction

a. Give a brief introduction to the assignment, find a related article, and write a one-paragraph summary of it that supports your assignment. Give the full citation of the article.

b. Dataset 1: Give a short description of this dataset. Is it primary or secondary data? What types of variables are involved? Explain briefly what the possible cases in this study are.

c. Dataset 2: Explain how you collected the data and discuss its limitations (e.g. whether your sample is biased). Is it primary or secondary data? What is/are the type(s) of variable(s) involved? Describe the cases you consider for this dataset.

2. Section 2: Analysis of single variable in Dataset 1

a. To answer the research question “Which type of public transport was most used by the NSW people during 8th to 14th of August 2016?”, provide a suitable numerical summary and graphical display for the variable mode of Dataset 1. Give a detailed comment to answer the research question.

b. To answer the research question “Do more than 50% of public transport users in NSW use the mode of transport found in Part a?”, set up appropriate hypotheses, perform a hypothesis test, and answer the research question by writing the conclusion of the test.

3. Section 3: Analysis of two variables in Dataset 1

The NSW Government needs to decide whether to build an underground railway line from Parramatta, Bankstown or Gosford to Central. To prepare a recommendation:

a. Give a numerical summary and an appropriate graphical display for the variable location, considering only those three stations, and for the variable count, considering the data with trains only.

b. Perform a suitable hypothesis test at a 5% level of significance to test whether there is a difference between the mean counts of taps on and off.

c. Use the conclusion of the test in part b and the outputs in part a to write a recommendation to the NSW Government.

4. Section 4: Collect and analyze Dataset 2

You are interested in finding whether there is a difference in preference between genders in terms of their transport mode (Bus, Train, Ferry and Light Rail). By considering an appropriate number of cases and variables, give a proper graphical display and use it to write comments.

5. Section 5: Discussion & Conclusion

Write an executive summary combining all your findings in the previous sections, which must be a valuable recommendation for NSW Transport. Give a suggestion for further research.

Section 1: Introduction

The aim of this assignment is to test skills in collecting and analyzing data to answer a specific business problem. It is also an opportunity to apply the theory learnt during the course: finding numerical summaries, choosing appropriate graphical displays, and using statistical inference to solve business problems, including constructing hypotheses, testing them and interpreting the findings (Ryabko, Stognienko, & Shokin, 2004).

We are presented with data on the NSW transport system and asked to make evidence-based recommendations aimed at improving the public transport system. The project poses a series of research questions to be answered using the knowledge gained in the course.

Dataset 1:

The first dataset (Dataset 1) is secondary data provided by the NSW transport system. It has a total of 1000 observations with six variables. The variables are described below.

Table 1: Description of the variables

| Variable | Description | Values | Variable Type |
|---|---|---|---|
| mode | Type of public transport | Bus, Train, Ferry and Light Rail | Nominal variable (qualitative) |
| date | Date on which the tap on/off took place | Day/month/year | Nominal variable (qualitative) |
| tap | Whether the record is a tap on or a tap off | On and Off | Nominal variable (qualitative) |
| loc | Location of the stop: postcode for buses, station name for the other modes | Postcodes and names of the stations | Nominal variable (qualitative) |
| count | Total number of taps on or off at the given location on the given date | Number | Scale variable (quantitative) |

The study uses 1000 cases (the number of observations).

Dataset 2:

The second dataset (Dataset 2) is primary data collected by the researcher. A random sample of 50 individuals was selected, and each person was interviewed about their gender, age and the mode of transport they prefer to use most. The data has a total of 50 observations (one case per respondent) with three variables.

Random sampling was used to collect the data so as to understand which mode of transport individuals use most frequently. This is primary data because it was collected directly from the subjects. The main limitation of the data is the small sample size of only 50 cases. The variables are described below.

Table 2: Description of the variables

| Variable | Description | Values | Variable Type |
|---|---|---|---|
| Mode | Type of public transport | Bus, Train, Ferry and Light Rail | Nominal variable (qualitative) |
| Age | Age of the respondent | Number | Scale variable (quantitative) |
| Gender | Gender of the respondent | Male and female | Nominal variable (qualitative) |

Section 2: Analysis of single variable in Dataset 1

In this section we answer the research questions posed, using Dataset 1.

a. Which type of public transport was most used by the NSW people during 8th to 14th of August 2016?

To answer this research question, we computed a frequency distribution of the variable mode. Table 3 below gives the results.

Table 3: Frequency table for the mode of transport used

| Mode | Count of mode | Percent |
|---|---|---|
| Bus | 467 | 46.7% |
| Ferry | 25 | 2.5% |
| Light-rail | 24 | 2.4% |
| Train | 484 | 48.4% |
| Grand Total | 1000 | 100.0% |

As can be seen, bus and train were by far the most used modes. Train was the most frequently used, accounting for 48.4% (n = 484) of the cases in the week, followed by bus with 46.7% (n = 467). Ferry and light rail were the least used, with 2.5% (n = 25) and 2.4% (n = 24) of the cases respectively.


Figure 1: Bar chart on mode of transport used

b. To answer the research question of whether the proportion of public transport users who use the train is greater than 50%, the following hypotheses were set up and tested.

H0: The proportion of transport users who use train is not significantly different from 50%.

HA: The proportion of transport users who use train is significantly different from 50%.

To test this, a one-sample t-test was run at the 5% level of significance, with train use coded as a 0/1 indicator so that its mean is the sample proportion. The results are given below.

Table 4: One-Sample Statistics

| | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| Train | 1000 | .4840 | .49999 | .01581 |

Table 5: One-Sample Test

Test Value = 0.5

| | t | df | Sig. (2-tailed) | Mean Difference | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper |
|---|---|---|---|---|---|---|
| Train | -1.012 | 999 | .312 | -.01600 | -.0470 | .0150 |

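A one-sample t-test was run to determine whether the proportion of NSW transport users who rely on the train is more than 50%. The proportion of those who used the train (mean = 0.484, SD = 0.50) was not significantly different from 50% (95% CI of the difference, -0.047 to 0.015), t(999) = -1.012, p = .312. We therefore fail to reject the null hypothesis: the sample proportion is below 50% and the difference is not significant, so there is no evidence that more than 50% of public transport users use the train.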
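Because the research question is directional ("more than 50%"), a one-sided test of a proportion is a natural cross-check on the two-sided t-test reported above. The sketch below is not the analysis used in this report: it assumes a normal approximation (a z-test), uses only the train count from Table 3, and the function name is purely illustrative.

```python
import math
from statistics import NormalDist


def one_sided_proportion_z_test(successes, n, p0):
    """Z-test of H0: p <= p0 against HA: p > p0 (normal approximation)."""
    p_hat = successes / n
    # The standard error uses the proportion assumed under H0.
    se = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / se
    # Upper-tail p-value, because HA is "greater than".
    p_value = 1 - NormalDist().cdf(z)
    return p_hat, z, p_value


# 484 train cases out of 1000 (Table 3), tested against 50%.
p_hat, z, p_value = one_sided_proportion_z_test(484, 1000, 0.5)
print(f"p_hat = {p_hat:.3f}, z = {z:.3f}, one-sided p = {p_value:.3f}")
```

The z statistic is very close to the t value in Table 5, and the one-sided p-value is large because the sample proportion lies below 50%, which points to the same conclusion.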

Section 3: Analysis of two variables in Dataset 1

The NSW Government needs to decide whether to build an underground railway line from Parramatta, Bankstown or Gosford to Central. To prepare a recommendation:

a. Give a numerical summary and an appropriate graphical display for the variable location, considering only those three stations, and for the variable count, considering the data with trains only.

In this section we first consider how often the train appears in the data for each of the three locations. This information is given in the table below.

Table 6: Frequency of train from the three locations

| Station | Count | Percent |
|---|---|---|
| Parramatta Station | 7 | 53.8% |
| Gosford Station | 2 | 15.4% |
| Bankstown Station | 4 | 30.8% |

Figure 2: Bar chart for the count of times the train leaves the stations

Table 7 summarizes the variable count. Across the 1000 cases in the table, the mean count was 103.38 with a standard deviation of 226.14.

Table 7: Descriptive statistics for the variable count

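| Statistic | count |
|---|---|
| Mean | 103.379 |
| Standard Error | 7.151282 |
| Median | 53 |
| Mode | 18 |
| Standard Deviation | 226.1434 |
| Sample Variance | 51140.84 |
| Kurtosis | 238.9731 |
| Skewness | 13.04214 |
| Range | 4955 |
| Minimum | 18 |
| Maximum | 4973 |
| Sum | 103379 |
| Count | 1000 |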

The mode of the counts was 18 and the median was 53, both well below the mean of 103.38. The skewness value (13.04) indicates that the data are heavily skewed. This is consistent with the minimum count of 18 and the maximum count of 4973: the very large range suggests the presence of outliers, which would account for the skewness observed.

The histogram presented below further shows that the data are skewed to the right (longer tail to the right).

Figure 3: Histogram of the variable count
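The entries in Table 7 are linked by simple arithmetic, which gives a quick consistency check on the reported output. The following is a minimal sketch that uses only values copied from Table 7; the variable names are illustrative.

```python
import math

# Values copied from Table 7 (descriptive statistics for count).
n = 1000
total = 103379
sd = 226.1434
median = 53

mean = total / n                    # mean = sum / number of cases
standard_error = sd / math.sqrt(n)  # standard error of the mean
variance = sd ** 2                  # sample variance = SD squared

print(f"mean = {mean:.3f}")
print(f"standard error = {standard_error:.4f}")
print(f"variance = {variance:.2f}")
# A mean well above the median is typical of right-skewed data.
print("mean > median:", mean > median)
```

The recomputed mean, standard error and variance agree with the table, and the mean exceeding the median matches the right skew seen in the histogram.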

b. Perform a suitable hypothesis test at a 5% level of significance to test whether there is a difference between the mean counts of taps on and off.

To answer this, the following hypotheses were tested at the 5% level of significance.

H0: There is no significant difference in the mean counts of taps on and taps off

HA: There is significant difference in the mean counts of taps on and taps off.

To test this, an independent samples t-test was used. The results are given below;


Table 8: Group Statistics

| | Tap | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| count | On | 481 | 106.65 | 269.081 | 12.269 |
| count | Off | 519 | 100.35 | 177.530 | 7.793 |

Table 9: Independent Samples Test

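Columns F and Sig. belong to Levene's Test for Equality of Variances; the remaining columns belong to the t-test for Equality of Means.

| count | F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | .083 | .774 | .440 | 998 | .660 | 6.296 | 14.319 | -21.802 | 34.394 |
| Equal variances not assumed | | | .433 | 821.5 | .665 | 6.296 | 14.535 | -22.233 | 34.825 |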

We performed an independent samples t-test to compare the mean count for taps on with the mean count for taps off. The mean count for taps on (M = 106.65, SD = 269.08, N = 481) did not differ significantly from the mean count for taps off (M = 100.35, SD = 177.53, N = 519), t(998) = 0.440, p = .660, two-tailed. The observed mean difference of 6.30 was not significant at the 5% level of significance. In other words, there is no evidence that the mean count depends on whether the tap is on or off.
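The "equal variances not assumed" row of Table 9 can be reproduced from the group statistics in Table 8 alone. The sketch below computes Welch's t statistic and the Welch-Satterthwaite degrees of freedom; the function name is illustrative, and the p-value uses a normal approximation, which is an assumption made here only because the degrees of freedom are large.

```python
import math
from statistics import NormalDist


def welch_t_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic, degrees of freedom and SE from summary statistics."""
    v1 = sd1 ** 2 / n1  # squared standard error of the first group mean
    v2 = sd2 ** 2 / n2  # squared standard error of the second group mean
    se_diff = math.sqrt(v1 + v2)
    t = (mean1 - mean2) / se_diff
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df, se_diff


# Group statistics from Table 8: taps on versus taps off.
t, df, se_diff = welch_t_from_summary(106.65, 269.081, 481, 100.35, 177.530, 519)

# Two-sided p-value via the normal approximation (assumption: df is large).
p_approx = 2 * (1 - NormalDist().cdf(abs(t)))

print(f"t = {t:.3f}, df = {df:.1f}, SE = {se_diff:.3f}, p = {p_approx:.3f}")
```

Both rows of Table 9 lead to the same conclusion, so the choice between the pooled and the Welch version of the test does not change the result here.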

c. Use the conclusion of the test in part b and the outputs in part a to write a recommendation to the NSW Government.

We concluded that there is no significant difference between the mean counts for taps on and taps off. The three chosen stations also did not show much traffic, with only 13 train cases between them (Table 6). On this evidence, the government's plan to build an underground railway line from Parramatta, Bankstown or Gosford to Central is not well supported by the data, and we recommend that it be reconsidered.

Section 4: Collect and analyze Dataset 2

We are interested in finding whether there is a difference in preference between genders in terms of their transport mode (Bus, Train, Ferry and Light Rail). By considering an appropriate number of cases and variables, we give a proper graphical display and use it to write comments.

The results for this section are presented below;

| Mode | Female | Male | Grand Total |
|---|---|---|---|
| Bus | 16.7% | 42.3% | 30.0% |
| Ferry | 20.8% | 7.7% | 14.0% |
| Light Rail | 8.3% | 11.5% | 10.0% |
| Train | 54.2% | 38.5% | 46.0% |
| Grand Total | 100.00% | 100.00% | 100.00% |

As can be seen, the largest share of male commuters (42.3%, n = 11) reported using the bus, while the majority of female commuters (54.2%, n = 13) reported using the train.

Chi-Square test

A Chi-square test was performed to determine whether there is significant association between gender and the preferred mode of transport (Bagdonavicius & Nikulin, 2011). The hypothesis tested is given below;

H0: There is no significant association between gender and preferred mode of transport

HA: There is significant association between gender and preferred mode of transport

This was tested at 5% level of significance and the results are given below;

Table 10: Chi-Square Tests

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 5.072 (a) | 3 | .167 |
| Likelihood Ratio | 5.239 | 3 | .155 |
| N of Valid Cases | 50 | | |

a. 4 cells (50.0%) have expected count less than 5. The minimum expected count is 2.40.

The p-value for the test is 0.167, which is greater than the 5% level of significance. We therefore fail to reject the null hypothesis and conclude that there is no evidence of a significant association between gender and preferred mode of transport. Because 4 of the 8 cells have an expected count below 5, this result should be read with caution.
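The Pearson statistic in Table 10 can be recomputed by hand as a check. The sketch below assumes cell counts reconstructed from the within-gender percentages reported above (24 female and 26 male respondents, consistent with n = 13 and n = 11); only those two counts are stated in the text, so the rest are an inference rather than reported data. The function names are illustrative, and the p-value uses the closed-form upper tail of the chi-square distribution with 3 degrees of freedom.

```python
import math

# (female, male) counts per mode. ASSUMPTION: reconstructed from the reported
# percentages with 24 female and 26 male respondents; not given in the source.
observed = {
    "Bus": (4, 11),
    "Ferry": (5, 2),
    "Light Rail": (2, 3),
    "Train": (13, 10),
}


def chi_square_independence(table):
    """Pearson chi-square statistic, df and smallest expected count."""
    rows = list(table.values())
    col_totals = [sum(col) for col in zip(*rows)]
    grand_total = sum(col_totals)
    statistic = 0.0
    min_expected = float("inf")
    for row in rows:
        row_total = sum(row)
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand_total
            statistic += (obs - expected) ** 2 / expected
            min_expected = min(min_expected, expected)
    df = (len(rows) - 1) * (len(col_totals) - 1)
    return statistic, df, min_expected


def chi2_upper_tail_df3(x):
    """P(X > x) for a chi-square variable with exactly 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


statistic, df, min_expected = chi_square_independence(observed)
p_value = chi2_upper_tail_df3(statistic)
print(f"chi-square = {statistic:.3f}, df = {df}, p = {p_value:.3f}")
print(f"minimum expected count = {min_expected:.2f}")
```

Under these reconstructed counts the statistic, degrees of freedom and minimum expected count agree with Table 10 and its footnote.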

Section 5: Discussion & Conclusion

The main purpose of this study was to present an analysis of the NSW transport system. We were provided with a secondary dataset (Dataset 1) comprising 1000 cases with six variables. In addition to this secondary data, we gathered survey data from 50 individuals. We sought to find out the most commonly used mode of transport. Results showed that the most commonly used mode was the train, followed by the bus; people also used ferry and light rail, but their usage was minimal compared with bus and train. Comparing modes of transport by gender using Dataset 2, we noted that the majority of female respondents preferred the train, while the largest share of male commuters preferred the bus. Based on these findings, we make the following recommendation to the NSW Government:

  • The train is very widely used by commuters; it would therefore be prudent to improve this particular mode of transport to make it more effective. Building an underground railway line from Parramatta, Bankstown or Gosford to Central would benefit commuters.

Future research should be broad enough to examine the motivation behind the preference for the various modes of transport. This would help management and the government to understand fully the needs and desires of the people.

References

Bagdonavicius, V., & Nikulin, M. S. (2011). Chi-squared goodness-of-fit test for right censored data. The International Journal of Applied Mathematics and Statistics, 30–50.

Ryabko, B. Y., Stognienko, V. S., & Shokin, Y. I. (2004). A new test for randomness and its application to some cryptographic problems. Journal of Statistical Planning and Inference, 123, 365–376. doi:10.1016/s0378-3758(03)00149-6
