Answer 1: Measures of Central Tendency, Dispersion, Skewness, and Kurtosis
With regard to the memorandum concerning the analysis of the automotive CO2 emissions data, the following report is presented.
Data have been provided on the CO2 emissions recorded in a sample of 1082 vehicles tested in Canada in 2015. The sample includes vehicles with 4, 6 and 8 cylinder engines. There is also variety in the fuel used, with four options: regular petrol, premium petrol, diesel and ethanol. The transmission mode may be automatic or manual. Information on engine size has also been provided so that its impact on the level of CO2 emissions can be tested.
Key observations in response to the queries raised in your memo are presented below.
The summary statistics for CO2 emissions in the sample have been computed in the attached Excel file, covering measures of central tendency and dispersion along with skewness and kurtosis. A histogram has also been drawn using the relevant Excel functions.
Based on the computed descriptive statistics and the frequency distribution histogram, the given distribution does not appear to be normal. This is indicated by the presence of skew, which for a normal distribution would be zero. The measures of central tendency are also unequal, which again points to non-normality. Besides, the peak of the given distribution seems to be lower than that of a normal distribution, as the expected kurtosis is 3 while the value observed for the given data is significantly lower. The standard deviation, and hence the variation in the data, seems low when seen relative to the mean. Further, the average value of CO2 emissions is 244.69 g/km, which is marginally higher than the corresponding median of 239 g/km; a mean above the median is consistent with a right-skewed distribution.
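The descriptive measures discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the readings in `sample` are hypothetical values, not the actual 1082 observations, and the skewness and kurtosis use simple population-moment formulas, whereas Excel's SKEW and KURT apply small-sample corrections. Note also that Excel's KURT reports excess kurtosis, for which the normal benchmark is 0 rather than 3.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    # Central moments about the mean (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),      # sample standard deviation (n - 1)
        "skewness": m3 / m2 ** 1.5,             # 0 for a symmetric distribution
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,  # 0 for a normal distribution
    }


# Hypothetical CO2 readings in g/km (assumption, for illustration only).
sample = [180, 195, 210, 225, 239, 250, 265, 290, 330, 400]
summary = describe(sample)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

In this made-up sample the mean exceeds the median and the skewness is positive, the same pattern the report describes for the real data.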
One way to examine the relationship between the two variables of interest is a scatter plot, which visually indicates the nature of the relationship. This is captured in the attached Excel sheet.
It is apparent from the scatter plot that there is no discernible pattern in the relationship between fuel type and the CO2 emissions recorded. The correlation coefficient has therefore been used to measure the strength and direction of the association between the two variables. It comes out to be +0.167. A positive correlation coefficient suggests that CO2 emissions tend to be lowest for regular petrol (denoted by 1) and to increase for premium petrol (denoted by 2), diesel (denoted by 3) and ethanol (denoted by 4). While the higher emissions associated with diesel seem justified, the higher emission levels associated with premium petrol and ethanol are difficult to explain, as these are believed to be cleaner fuels than diesel and regular petrol. However, considering the low correlation observed, it is quite possible that the above trend is attributable to other factors such as usage pattern, engine size and the number of cylinders in the engine, all of which may affect the overall emissions level.
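The correlation coefficient referred to above can be computed as follows. This is a minimal sketch: the fuel codes follow the 1 to 4 coding described in the report, but the paired emission values are hypothetical and will not reproduce the reported +0.167.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Fuel type coded 1 = regular petrol, 2 = premium petrol, 3 = diesel, 4 = ethanol.
# The emission values (g/km) are hypothetical, for illustration only.
fuel_codes = [1, 1, 2, 2, 3, 3, 4, 4]
emissions = [230, 250, 240, 260, 235, 255, 250, 270]
r = pearson_r(fuel_codes, emissions)
print(f"r = {r:.3f}")
```

Because fuel type is a category rather than a quantity, the value of r depends on the order in which the codes are assigned, which is one more reason to read a small coefficient such as this with caution.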
Answer 2: Correlation and Scatter Plot
- In the given case, a confidence interval needs to be estimated for the mean CO2 emissions associated with each number of cylinders in the engine.
Based on the 95% confidence interval obtained for four-cylinder engines in the attached Excel sheet, it can be estimated with 95% confidence that the average CO2 emission of a vehicle with a four-cylinder engine lies between 198.38 g/km and 203.46 g/km.
Based on the 95% confidence interval obtained for six-cylinder engines, it can be estimated with 95% confidence that the average CO2 emission of a vehicle with a six-cylinder engine lies between 255.39 g/km and 260.97 g/km.
Based on the 95% confidence interval obtained for eight-cylinder engines, it can be estimated with 95% confidence that the average CO2 emission of a vehicle with an eight-cylinder engine lies between 316.39 g/km and 328.23 g/km.
From the Excel computation of the respective confidence intervals, the three intervals do not overlap, so there appears to be a significant difference, with CO2 emissions increasing as the number of cylinders in the engine increases. Thus, for lowering CO2 emissions, a vehicle running on a four-cylinder engine is more efficient than the corresponding variants with more cylinders.
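A confidence interval for a mean of this kind can be sketched as below. The sketch assumes a normal critical value of 1.96, which is reasonable for large groups, and the readings passed in are hypothetical rather than the actual four-cylinder observations.

```python
import math
import statistics


def mean_confidence_interval(values, z=1.96):
    """Return (lower, upper) bounds of a z-based confidence interval for the mean."""
    n = len(values)
    mean = statistics.fmean(values)
    std_error = statistics.stdev(values) / math.sqrt(n)  # sample stdev / sqrt(n)
    margin = z * std_error
    return mean - margin, mean + margin


# Hypothetical CO2 readings (g/km) for one cylinder group, for illustration only.
four_cylinder = [196, 198, 200, 202, 204]
lower, upper = mean_confidence_interval(four_cylinder)
print(f"95% CI: {lower:.2f} to {upper:.2f} g/km")
```

For small groups a t critical value would be the more usual choice in place of 1.96.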
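- The relevant formula for the computation of a confidence interval for a proportion is given below.
Confidence interval = p ± Z*sqrt(p*(1-p)/n)
Where p is the sample proportion, n is the sample size and Z is the critical value for the chosen level of confidence (1.96 for 95%)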
Further, the relevant computations have been performed in Excel to find the proportion of vehicles with 4, 6 and 8 cylinder engines respectively. From the Excel computations, it may be estimated with 95% confidence that the proportion of vehicles with 4 cylinder engines lies between 42.23% and 48.16%. It may also be estimated with 95% confidence that the proportion of vehicles with 6 cylinder engines lies between 32.64% and 38.34%, and that the proportion of vehicles with 8 cylinder engines lies between 16.96% and 21.67%. This clearly indicates that there is a difference in the prevalence of vehicles with different cylinder counts. The incidence of engines with a higher cylinder count is typically lower: it is highest for vehicles with 4 cylinder engines and lowest for vehicles with 8 cylinder engines. This augurs well for the control of pollution, as engines with more cylinders tend to have higher emissions, as concluded in part (a).
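The interval for a proportion can be illustrated with the standard normal-approximation formula. The count of 489 four-cylinder vehicles used here is an assumption: it is not stated in the report and has been back-calculated from the reported interval and the sample size of 1082.

```python
import math


def proportion_confidence_interval(successes, n, z=1.96):
    """Normal-approximation confidence interval for a proportion."""
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# 489 is an assumed count (not given in the report); 1082 is the stated sample size.
lower, upper = proportion_confidence_interval(489, 1082)
print(f"95% CI: {lower:.2%} to {upper:.2%}")
```

Under that assumed count the bounds come out at about 42.23% and 48.16%, in line with the figures quoted for 4 cylinder engines.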
Answer 3: Confidence Intervals for Different Cylinders in the Engine
A claim is made in the given case that at least 5% of vehicles have CO2 emissions in excess of 350 grams per kilometer. This claim now needs to be tested against the given sample.
The relevant hypotheses for the testing of the claim are indicated below.
Null Hypothesis: p ≤ 0.05 i.e. the proportion of vehicles having CO2 emissions greater than 350 g/km is not more than 5% of the total vehicles.
Alternative Hypothesis: p > 0.05 i.e. the proportion of vehicles having CO2 emissions greater than 350 g/km is more than 5% of the total vehicles.
The hypothesis test and the decision reached are captured in the attached Excel file, which first computes the test statistic, then the critical value and p value, and finally gives the decision to reject or not reject the null hypothesis. From the Excel output, the null hypothesis cannot be rejected, because the p value is greater than the level of significance selected for the test. The alternative hypothesis therefore cannot be accepted. This leads to the conclusion that the sample does not provide sufficient evidence that the proportion of vehicles with CO2 emissions greater than 350 g/km is more than 5%. Hence, the given claim is not supported by the data.
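The test described above can be sketched as a one-sided z test for a proportion. The count of 40 vehicles above 350 g/km is an assumption taken from the ratio 40/1082 used in the sample size calculation of this report; the 5% significance level is likewise assumed.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def upper_tail_proportion_test(successes, n, p0, alpha=0.05):
    """Test H0: p <= p0 against H1: p > p0 using the normal approximation."""
    p_hat = successes / n
    std_error = math.sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / std_error
    p_value = 1.0 - normal_cdf(z)             # upper-tail probability
    return z, p_value, p_value < alpha


# 40 of 1082 vehicles above 350 g/km is an assumed count (see lead-in).
z_stat, p_value, reject_null = upper_tail_proportion_test(40, 1082, 0.05)
print(f"z = {z_stat:.3f}, p value = {p_value:.4f}, reject H0: {reject_null}")
```

With these inputs the sample proportion is below 5%, so the test statistic is negative and the null hypothesis is not rejected, which agrees with the decision reported from Excel.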
- The objective in this case was to analyze the causal relationship between engine size and CO2 emissions through the use of a linear regression model. Based on the Excel output obtained, the relation between the two may be summarized as follows.
CO2 emissions (g/km) = 130.319 + 36.594*Engine Size
Further, the coefficient of determination or R^2 value for the above model is 0.7017, which implies that 70.17% of the variation in CO2 emissions is explained by the corresponding variation in engine size. Also, considering that the p value associated with the slope coefficient is effectively zero, engine size is a significant determinant of CO2 emissions. Besides, the regression model as a whole is significant, taking into consideration the F value indicated in the Excel output along with the corresponding p value for the test. Here the p value associated with the F test was found to be lower than the assumed significance level. Hence, it would be appropriate to conclude that a larger engine size is associated with an increase in CO2 emissions.
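The slope, intercept and R^2 reported above are the usual outputs of a least-squares fit, which can be sketched as follows. The engine sizes and emission values are hypothetical, so the fitted numbers will differ from those in the Excel output.

```python
def fit_simple_regression(xs, ys):
    """Least-squares fit of y = intercept + slope * x; also returns R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = 1.0 - ss_resid / ss_total  # share of variation explained
    return intercept, slope, r_squared


# Hypothetical engine sizes (litres) and CO2 emissions (g/km), for illustration only.
engine_sizes = [1.4, 2.0, 2.5, 3.5, 5.0, 6.2]
co2 = [170, 205, 230, 255, 320, 350]
intercept, slope, r_squared = fit_simple_regression(engine_sizes, co2)
print(f"CO2 = {intercept:.3f} + {slope:.3f} * size, R^2 = {r_squared:.4f}")
```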
- There should not be any concerns about using the above model to estimate the CO2 emissions of a 1000 cc or 1 liter engine. This is because the given value falls within the range of values from which the regression equation has been derived. The sample includes vehicles with engine sizes both below and above 1 liter, so estimating for this engine size should not be a matter of concern. A concern would arise only if the independent variable, i.e. the capacity of the engine, were outside the range of values used to build the regression model.
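The point about staying within the observed range can be made concrete with a small helper built on the reported equation. The coefficients come from the report, but the range limits `min_size` and `max_size` are hypothetical placeholders, since the actual smallest and largest engine sizes in the sample are not stated here.

```python
INTERCEPT = 130.319  # from the reported regression equation
SLOPE = 36.594       # g/km of CO2 per litre of engine size


def predict_co2(engine_size, min_size, max_size):
    """Predict CO2 emissions (g/km), refusing to extrapolate beyond the data range."""
    if not min_size <= engine_size <= max_size:
        raise ValueError("engine size is outside the range used to fit the model")
    return INTERCEPT + SLOPE * engine_size


# Hypothetical range of engine sizes in litres (assumption, not from the report).
estimate = predict_co2(1.0, min_size=0.9, max_size=8.0)
print(f"Estimated CO2 for a 1 liter engine: {estimate:.3f} g/km")
```

For a 1 liter engine the equation gives 130.319 + 36.594 = 166.913 g/km.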
- The minimum sample size can be determined using the following formula.
Minimum sample size = Z^2*p*(1-p)/c^2
Assuming a 95% level of confidence, Z = 1.96
Also, p = 40/1082 = 0.03697
Further, (1-p) = 1-0.03697 = 0.963
Thus, the minimum sample size required = (1.96^2*0.03697*0.963)/0.03^2 = 152
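The arithmetic for the proportion-based sample size can be checked with a short sketch. The inputs are those used in the calculation above (Z = 1.96, p = 40/1082 and a tolerance of 0.03); rounding up to the next whole vehicle is an assumption about how the result is reported.

```python
import math


def min_sample_size_proportion(p, tolerance, z=1.96):
    """Minimum n so that a proportion is estimated within +/- tolerance."""
    return math.ceil(z ** 2 * p * (1 - p) / tolerance ** 2)


n_required = min_sample_size_proportion(p=40 / 1082, tolerance=0.03)
print(n_required)
```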
- The minimum sample size can be determined using the following formula.
Minimum sample size = Z^2*B^2/c^2
Where B is the standard deviation
And c is the tolerance limit
Assuming a 95% level of confidence, Z = 1.96
Hence, minimum sample size = 1.96^2*2.897^2/0.5^2 = 129
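The same check can be made for the mean-based sample size. The inputs are those used above (Z = 1.96, standard deviation B = 2.897 and tolerance c = 0.5); rounding up to a whole number is again an assumption about presentation.

```python
import math


def min_sample_size_mean(std_dev, tolerance, z=1.96):
    """Minimum n so that a mean is estimated within +/- tolerance."""
    return math.ceil(z ** 2 * std_dev ** 2 / tolerance ** 2)


n_required_mean = min_sample_size_mean(std_dev=2.897, tolerance=0.5)
print(n_required_mean)
```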
From the above, it is fair to conclude that a sample size of 152, the larger of the two minimums computed, would be sufficient, and therefore a smaller sample than the current one of 1,082 would work fine.
On the basis of the above results, it is fair to conclude that the distribution of CO2 emissions is not normal, as is evident from the descriptive statistics computed. The average value of CO2 emissions is also greater than the median. The relationship between type of fuel and CO2 emissions is weak, and its direction contradicts the available literature, in which alternative fuels like ethanol are considered to be cleaner. Further, there is a significant difference between the respective confidence intervals for CO2 emissions, leading to the conclusion that engines with more cylinders have higher emissions. With regard to the number of vehicles with different engines, vehicles with 4 cylinder engines have the highest incidence while vehicles with 8 cylinder engines have the lowest. The claim that more than 5% of vehicles emit CO2 at a rate of more than 350 g/km is not supported by the sample, as ascertained through hypothesis testing. Additionally, there is a significant relation between engine size and CO2 emissions, with larger engines associated with higher emissions. It is also permissible to use the given regression model to estimate the emissions of a 1000 cc engine, as this value lies within the range of the independent variable used. Lastly, the current sample size of 1,082 is well in excess of the minimum sample size required, which is significantly lower at 152. This needs to be kept in mind for next year's research so as to save on incremental effort and cost.