Problem 2
The production of wine is a multi-billion-dollar worldwide industry. In an attempt to develop a model of wine quality as judged by wine experts, data (see Excel file VinhoVerde) was collected from red wine variants from a particular type of foreign wine. A multiple linear regression model should be developed from a sample of several wines. The model will be used to predict wine quality, measured on a scale from 0 (very bad) to 10 (excellent) based on the alcohol content (%) and the amount of chlorides.
a) State the multiple linear regression equation with the number of independent variables related to the problem in general.
b) Run the Regression Data analysis on this data, provide the obtained model.
c) Interpret the meaning of the parameters β1 and β2 of the model (what does each of them represent?). Does β0 have a practical interpretation?
d) Predict the mean wine quality rating for wines with
(i) 11% alcohol content and chlorides of 0.07.
(ii) 40% alcohol content and chlorides of 0.08.
Comment on whether the model is suitable for the requested predictions.
e) Test the hypothesis on whether the slopes of the model are (jointly) significant.
f) Test the hypothesis on whether each variable in the model (individually) is significant.
g) In case when you find in (e)-(f) that some parameter of the model is insignificant, delete it from the model and run your new model.
Your submission for this problem should include 4-5 pages (in the same pdf file as #1):
1 page with data you use,
1 page with the output from Data Analysis ToolPack you obtain for part (b),
1-2 pages with handwritten or typed responses for parts (a), (c)-(g). The summary should be written in complete sentences stating all obtained results.
The printout of each worksheet should fit into 1 page. Make sure you include all headings (A, B, ….; 1, 2, ) in the printout so that it’s possible to trace the formulae.
Format your work so that it’s readable.