Title: | Companion Package for the Book "Model-Based Clustering and Classification for Data Science" by Bouveyron et al. (2019, ISBN:9781108644181). |
---|---|
Description: | The companion package provides all original data sets and functions that are used in the book "Model-Based Clustering and Classification for Data Science" by Charles Bouveyron, Gilles Celeux, T. Brendan Murphy and Adrian E. Raftery (2019, ISBN:9781108644181). |
Authors: | Charles Bouveyron [cre, aut], Gilles Celeux [aut], T. Brendan Murphy [aut], Adrian Raftery [aut] |
Maintainer: | Charles Bouveyron <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.1.2 |
Built: | 2024-11-14 04:13:52 UTC |
Source: | https://github.com/cbouveyron/mbcbook |
The DESCRIPTION file:
Encoding: | UTF-8 |
Package: | MBCbook |
Type: | Package |
Title: | Companion Package for the Book "Model-Based Clustering and Classification for Data Science" by Bouveyron et al. (2019, ISBN:9781108644181). |
Version: | 0.1.2 |
Date: | 2024-05-06 |
Authors@R: | c( person("Charles", "Bouveyron", , "[email protected]", role = c("cre", "aut")), person("Gilles", "Celeux", , "[email protected]", role = "aut"), person("T. Brendan", "Murphy", , "[email protected]", role = "aut"), person("Adrian", "Raftery", , "[email protected]", role = "aut")) |
Depends: | R (>= 3.1.0), mclust, Rmixmod, MASS, mvtnorm, Matrix |
Suggests: | network, jpeg |
Description: | The companion package provides all original data sets and functions that are used in the book "Model-Based Clustering and Classification for Data Science" by Charles Bouveyron, Gilles Celeux, T. Brendan Murphy and Adrian E. Raftery (2019, ISBN:9781108644181). |
License: | GPL (>= 2) |
NeedsCompilation: | no |
URL: | https://github.com/cbouveyron/MBCbook |
BugReports: | https://github.com/cbouveyron/MBCbook/issues |
Config/pak/sysreqs: | make |
Repository: | https://cbouveyron.r-universe.dev |
RemoteUrl: | https://github.com/cbouveyron/mbcbook |
RemoteRef: | HEAD |
RemoteSha: | eae6fd7f42e34060dace15a00709502282d7f100 |
Author: | Charles Bouveyron [cre, aut], Gilles Celeux [aut], T. Brendan Murphy [aut], Adrian Raftery [aut] |
Maintainer: | Charles Bouveyron <[email protected]> |
Index of help topics:
AIDSBlogs | The AIDSBlogs data set |
Advice | The Advice data set from Lazega (2001) |
Coworker | The Coworker data set from Lazega (2001) |
Friend | The Friend data set from Lazega (2001) |
MBCbook-package | Companion Package for the Book "Model-Based Clustering and Classification for Data Science" by Bouveyron et al. (2019, ISBN:9781108644181). |
NIR | The chemometrics near-infrared (NIR) data set |
PoliticalBlogs | The political blog data set |
UScongress | The US congress vote data set |
amazonFineFoods | The Amazon Fine Foods data set |
constrEM | Semi-supervised clustering with must-link constraints |
credit | The Credit data set |
denoisePatches | Denoising of image patches |
imageToPatch | Transform an image into a collection of patches |
imshow | Display an image |
puffin | The puffin data set |
reconstructImage | Reconstructing an image from a patch decomposition |
rqda | Robust (quadratic) discriminant analysis |
usps358 | The handwritten digits usps358 data set |
varSelEM | A variable selection algorithm for clustering |
velib2D | The bivariate Vélib data set |
velibCount | The discrete version (count data) of the Vélib data set |
wine27 | The (27-dimensional) Italian Wine data set |
Charles Bouveyron [cre, aut], Gilles Celeux [aut], T. Brendan Murphy [aut], Adrian Raftery [aut]
Maintainer: Charles Bouveyron <[email protected]>
Charles Bouveyron and Gilles Celeux and T. Brendan Murphy and Adrian E. Raftery, Model-Based Clustering and Classification for Data Science: with Applications in R, Cambridge University Press, 2019.
Lazega (2001) <doi:10.2307/3556688> collected a network data set detailing interactions between a set of 71 lawyers in a corporate law firm in the USA. The data include measurements of the advice network, friendship network and co-worker network between the lawyers within the firm. Further covariates associated with each lawyer in the firm are also available including age, seniority, college education and office location.
data("Advice")
A large network object, which can be managed with the network library, with 71 nodes.
Lazega, E., The Collegial Phenomenon: The Social Mechanisms of Cooperation Among Peers in a Corporate Law Partnership, Oxford University Press, 2001 <doi:10.2307/3556688>.
data(Advice)
The AIDS blog data set records the pattern of citation among 146 unique blogs related to AIDS patients and their support networks. The data were originally collected by Gopal (2007) <doi:10.1007/1-4020-5427-0_18> over a randomly selected three-day period in August 2005. The nodes in the network correspond to blogs and a directed edge from one blog to another indicates that the former had a link to the latter in their web page.
data("AIDSBlogs")
A large network object, which can be managed with the network library, with 146 nodes.
Gopal, S., The evolving social geography of blogs, in Miller, H. J. (ed.), Societies and Cities in the Age of Instant Access, The GeoJournal Library, vol. 88., pp. 275–293, 2007 <doi:10.1007/1-4020-5427-0_18>.
data(AIDSBlogs)
The Amazon Fine Foods data set has 1646 rows and 1735 columns, describing whether a user (row) has rated and reviewed a product (column) or not.
data("amazonFineFoods")
A data frame with binary values indicating whether a user (row) has rated and reviewed a product (column) or not.
https://snap.stanford.edu/data/web-FineFoods.html.
data(amazonFineFoods)
Semi-supervised clustering with must-link constraints clusters data for which must-link constraints are available. This function implements the method described in Shental et al. (2003, ISBN:9781615679119).
constrEM(X, K, C, maxit = 30)
X |
a data frame of observations, assuming the rows are the observations and the columns the variables. Note that NAs are not allowed. |
K |
the number of desired groups. |
C |
a vector encoding the must-link constraints through chunklets. This vector must have one entry per observation. Two observations that have to be in the same group must be in the same chunklet. For instance, the chunklet vector (1,2,3,4,3,5) indicates that the 3rd and the 5th observations have a must-link constraint. If there are no must-link constraints, this vector should simply be 1:nrow(X). |
maxit |
the maximum number of iterations. |
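The chunklet encoding of C can be built from a list of must-link pairs. The base R sketch below is only an illustration: the helper make_chunklets and its arguments are hypothetical and not part of MBCbook, and the relabelling to consecutive integers is an assumption made to match the form of the example vector (1,2,3,4,3,5).

```r
# Hypothetical helper (not part of MBCbook): build a chunklet vector
# from a two-column matrix of must-link pairs of observation indices.
make_chunklets <- function(n, pairs) {
  C <- seq_len(n)                     # no constraint: one chunklet per observation
  for (i in seq_len(nrow(pairs))) {
    a <- C[pairs[i, 1]]
    b <- C[pairs[i, 2]]
    C[C == max(a, b)] <- min(a, b)    # merge the two chunklets
  }
  match(C, unique(C))                 # relabel chunklets as 1, 2, 3, ...
}

# A single must-link constraint between observations 3 and 5
make_chunklets(6, rbind(c(3, 5)))

# Constraints are transitive: 2-6 and 6-4 put 2, 4 and 6 together
make_chunklets(6, rbind(c(3, 5), c(2, 6), c(6, 4)))
```

The first call reproduces the vector (1,2,3,4,3,5) used in the description of C, and a vector built this way could then be passed as the C argument of constrEM.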
A list is returned with the following fields:
cls |
a vector containing the group memberships of the observations. |
T |
the posterior probabilities that the observations belong to the K groups. |
prop |
the estimated mixture proportions. |
mu |
the estimated mixture means. |
S |
the estimated mixture covariance matrices. |
ll |
the log-likelihood value at convergence. |
C. Bouveyron
This function implements the method described in Shental, N., Bar-Hillel, A., Hertz, T., and Weinshall, D., Computing Gaussian mixture models with EM using equivalence constraints, Proceedings of the 16th International Conference on Neural Information Processing Systems, pages 465–472, 2003 (ISBN:9781615679119).
# Simulation of some data
library(MBCbook)  # also attaches MASS, which provides mvrnorm()
set.seed(123)
n = 200
m1 = c(0,0); m2 = 4*c(1,1); m3 = 4*c(1,1)
S1 = diag(2); S2 = rbind(c(1,0),c(0,0.05))
S3 = rbind(c(0.05,0),c(0,1))
X = rbind(mvrnorm(n,m1,S1),mvrnorm(n,m2,S2),mvrnorm(n,m3,S3))
cls = rep(1:3,c(n,n,n))
# Encoding the constraints through chunklets
# Observations 398 and 430 are in the same chunklet
a = 398
b = 430
C = c(1:(b-1),a,b:(nrow(X)-1))
# Clustering with constrEM
res = constrEM(X,K=3,C,maxit=20)
Lazega (2001) <doi:10.2307/3556688> collected a network data set detailing interactions between a set of 71 lawyers in a corporate law firm in the USA. The data include measurements of the advice network, friendship network and co-worker network between the lawyers within the firm. Further covariates associated with each lawyer in the firm are also available including age, seniority, college education and office location.
data("Coworker")
A large network object, which can be managed with the network library, with 71 nodes.
Lazega, E., The Collegial Phenomenon: The Social Mechanisms of Cooperation Among Peers in a Corporate Law Partnership, Oxford University Press, 2001 <doi:10.2307/3556688>.
data(Coworker)
The Credit data set has 66 rows and 11 columns. It describes customers who took out loans from a credit company, using 11 categorical or ordinal variables.
data("credit")
A data frame with 66 observations and 11 categorical or ordinal variables.
https://husson.github.io/data.html
data(credit)
Denoising of image patches based on the clustering of patches.
denoisePatches(Y,out,P,sigma=10)
Y |
a data frame containing as rows the image patches to denoise |
out |
the mixmodCluster object that contains mixture parameters |
P |
the posterior probabilities that patches belong to the clusters |
sigma |
the noise standard deviation |
A data frame of the denoised patches is returned.
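A common way to denoise patches with a fitted Gaussian mixture is to apply, for each component k, the linear (Wiener-type) estimate mu_k + S_k (S_k + sigma^2 I)^{-1} (y - mu_k), and to average these estimates with the posterior probabilities P. The base R sketch below illustrates that idea under this assumption. It is not taken from the package source, and the function name wiener_denoise and its arguments mu and Sigma (lists of component means and covariance matrices) are hypothetical.

```r
# Hypothetical sketch (not the MBCbook implementation): posterior-weighted
# linear denoising of noisy patches under a Gaussian mixture prior.
wiener_denoise <- function(Y, mu, Sigma, P, sigma) {
  Y <- as.matrix(Y)
  d <- ncol(Y)
  out <- matrix(0, nrow(Y), d)
  for (k in seq_along(Sigma)) {
    # Linear filter of component k: S_k (S_k + sigma^2 I)^{-1}
    W <- Sigma[[k]] %*% solve(Sigma[[k]] + sigma^2 * diag(d))
    centred <- sweep(Y, 2, mu[[k]])                      # y - mu_k, row by row
    est <- sweep(centred %*% t(W), 2, mu[[k]], "+")      # mu_k + W (y - mu_k)
    out <- out + P[, k] * est                            # weight row i by P[i, k]
  }
  out
}

# Toy check with one component: unit prior variance and unit noise variance
# shrink each coordinate halfway towards the mean, so (2, 4) becomes (1, 2).
wiener_denoise(rbind(c(2, 4)), mu = list(c(0, 0)), Sigma = list(diag(2)),
               P = matrix(1, 1, 1), sigma = 1)
```

When sigma is small relative to the component variances the filter leaves the patches almost unchanged, and when it is large the patches are pulled towards the component means.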
C. Bouveyron & J. Delon
library(MBCbook)  # also attaches Rmixmod, which provides mixmodCluster()
# Noisy version of a 16x16 test image
Im = diag(16)
ImNoise = Im + rnorm(256,0,0.1)
# Extract 4x4 patches and cluster them with a spherical Gaussian mixture
X = imageToPatch(ImNoise,4)
out = mixmodCluster(X,10,model=mixmodGaussianModel(family=c("spherical")))
res = mixmodPredict(X,out@bestResult)
# Denoise the patches and rebuild the image
Xdenoised = denoisePatches(X,out,P = res@proba,sigma = 0.1)
ImRec = reconstructImage(Xdenoised,16,16)
# Original, noisy and reconstructed images side by side
par(mfrow=c(1,3)); imshow(Im); imshow(ImNoise); imshow(ImRec)
Lazega (2001) <doi:10.2307/3556688> collected a network data set detailing interactions between a set of 71 lawyers in a corporate law firm in the USA. The data include measurements of the advice network, friendship network and co-worker network between the lawyers within the firm. Further covariates associated with each lawyer in the firm are also available including age, seniority, college education and office location.
data("Friend")
A large network object, which can be managed with the network library, with 71 nodes.
Lazega, E., The Collegial Phenomenon: The Social Mechanisms of Cooperation Among Peers in a Corporate Law Partnership, Oxford University Press, 2001 <doi:10.2307/3556688>.
data(Friend)
Transform an image into a collection of small images (patches) that cover the original image.
imageToPatch(Im,f)
Im |
the image for which one wants to extract local patches. |
f |
the size of the desired patches (fxf). |
A data frame of all extracted patches is returned.
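To make the patch decomposition concrete, the base R sketch below extracts every overlapping f x f window of a matrix and stores each window as one row. It is an illustration only: the function name extract_patches is hypothetical, and the stride of one pixel and the column-wise ordering of pixels and patches are assumptions that may differ from what imageToPatch actually does.

```r
# Hypothetical sketch (not the MBCbook implementation): extract all
# overlapping f x f patches of a matrix, one patch per row.
extract_patches <- function(Im, f) {
  rows <- seq_len(nrow(Im) - f + 1)   # top-left row of each patch
  cols <- seq_len(ncol(Im) - f + 1)   # top-left column of each patch
  patches <- matrix(0, length(rows) * length(cols), f * f)
  k <- 1
  for (j in cols) {
    for (i in rows) {
      # Flatten the f x f window column by column into one row
      patches[k, ] <- as.vector(Im[i:(i + f - 1), j:(j + f - 1)])
      k <- k + 1
    }
  }
  as.data.frame(patches)
}

Im <- diag(16)
X <- extract_patches(Im, 4)
dim(X)  # (16 - 4 + 1)^2 = 169 patches of 4 * 4 = 16 pixels
```

With overlapping patches each pixel appears in several rows, so a reconstruction step has to combine the overlapping estimates, for example by averaging them.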
C. Bouveyron & J. Delon
library(MBCbook)  # also attaches Rmixmod, which provides mixmodCluster()
# Noisy version of a 16x16 test image
Im = diag(16)
ImNoise = Im + rnorm(256,0,0.1)
# Extract 4x4 patches and cluster them with a spherical Gaussian mixture
X = imageToPatch(ImNoise,4)
out = mixmodCluster(X,10,model=mixmodGaussianModel(family=c("spherical")))
res = mixmodPredict(X,out@bestResult)
# Denoise the patches and rebuild the image
Xdenoised = denoisePatches(X,out,P = res@proba,sigma = 0.1)
ImRec = reconstructImage(Xdenoised,16,16)
# Original, noisy and reconstructed images side by side
par(mfrow=c(1,3)); imshow(Im); imshow(ImNoise); imshow(ImRec)
A simple way of displaying an image, using the image function.
imshow(x,col=palette(gray(0:255/255)),useRaster = TRUE,...)
x |
the image to display as a matrix. |
col |
the color palette to use when displaying the image. |
useRaster |
logical; if TRUE a bitmap raster is used to plot the image instead of polygons. The grid must be regular in that case, otherwise an error is raised. For the behaviour when this is not specified, see the ‘Details’ section of the image function's help page. |
... |
additional arguments to provide to subfunctions. |
Im = diag(16)
imshow(Im)
The chemometrics near-infrared (NIR) data set has 202 observations and 2801 variables: 2800 near-infrared wavelength measures and 1 class variable. The data were obtained from the analysis of three types of textiles. The data set was first introduced in Devos et al. (2009) <doi:10.1016/j.chemolab.2008.11.005>.
data("NIR")
A data frame with 202 observations and 2801 variables. The first variable indicates the class-memberships of the observations.
Devos, O., Ruckebusch, C., Durand, A., Duponchel, L., and Huvenne, J.-P., Support vector machines (SVM) in near infrared (NIR) spectroscopy: Focus on parameters optimization and model interpretation, Chemometrics and Intelligent Laboratory Systems, 96, 27–33, 2009 <doi:10.1016/j.chemolab.2008.11.005>.
data(NIR)
# One curve per observation, coloured by class (first column)
matplot(t(NIR[,-1]),type='l',col=NIR[,1])
The political blog data set shows the linking structure between online blogs that comment on French political issues; the data were collected by Observatoire Presidentielle in October 2006. The data were first used by Latouche et al. (2011) <doi:10.1214/10-AOAS382>.
data("PoliticalBlogs")
A large network object, which can be managed with the network library, with 196 nodes.
P. Latouche, E. Birmelé, and C. Ambroise, Overlapping stochastic block models with application to the French political blogosphere, Annals of Applied Statistics, vol. 5 (1), pp. 309-336, 2011 <doi:10.1214/10-AOAS382>.
data(PoliticalBlogs)
# Visualization with the network library
library(network)
plot(PoliticalBlogs)
The puffin data set contains 69 individuals (birds) described by 5 categorical variables, in addition to class labels.
data("puffin")
A data frame with 69 observations and 6 variables.
class
the class of the observations
gender
gender of the bird
eyebrow
the eyebrow category of the bird
collar
the collar category of the bird
sub.caudal
the sub-caudal category of the bird
border
the border category of the bird
The data were provided by Bretagnolle, V., Museum d'Histoire Naturelle, Paris.
data(puffin)
A simple way of reconstructing an image from a patch decomposition.
reconstructImage(X,nl,nc)
X |
the matrix of patches to be used for reconstructing the image. |
nl |
the number of rows of the image. |
nc |
the number of columns of the image. |
An image is returned as a matrix object, which can be displayed with the imshow function.
C. Bouveyron & J. Delon
library(MBCbook)  # also attaches Rmixmod, which provides mixmodCluster()
# Noisy version of a 16x16 test image
Im = diag(16)
ImNoise = Im + rnorm(256,0,0.1)
# Extract 4x4 patches and cluster them with a spherical Gaussian mixture
X = imageToPatch(ImNoise,4)
out = mixmodCluster(X,10,model=mixmodGaussianModel(family=c("spherical")))
res = mixmodPredict(X,out@bestResult)
# Denoise the patches and rebuild the image
Xdenoised = denoisePatches(X,out,P = res@proba,sigma = 0.1)
ImRec = reconstructImage(Xdenoised,16,16)
# Original, noisy and reconstructed images side by side
par(mfrow=c(1,3)); imshow(Im); imshow(ImNoise); imshow(ImRec)
Robust (quadratic) discriminant analysis is a discriminant analysis method which is robust to label noise. This function implements the method described in Lawrence and Scholkopf (2001, ISBN:1-55860-778-1).
rqda(X,lbl,Y,maxit=50,disp=FALSE,...)
X |
a data frame containing the learning observations. |
lbl |
the class labels of the learning observations. |
Y |
a data frame containing the new observations to classify. |
maxit |
the maximum number of iterations. |
disp |
logical, if |
... |
additional arguments to provide to subfunctions. |
A list is returned with the following elements:
nu |
the estimated class proportions. |
mu |
the estimated class means. |
S |
the estimated covariance matrices. |
gamma |
the estimated purity level of the labels. |
Ti |
the posterior probabilities of the true labels given the observed labels for the learning observations. |
Pi |
the class posterior probabilities of the observations to classify. |
cls |
the class assignments of the observations to classify. |
ll |
the log-likelihood value. |
C. Bouveyron
Lawrence, N., and Scholkopf, B., Estimating a kernel Fisher discriminant in the presence of label noise, Pages 306–313 of: Proceedings of the Eighteenth International Conference on Machine Learning. ICML’01. San Francisco, CA, USA, 2001 (ISBN:1-55860-778-1).
library(MBCbook)  # also attaches MASS, which provides mvrnorm()
# Simulation of two Gaussian classes
n = 50
m1 = c(0,0); m2 = 1.5*c(1,-1)
S1 = 0.1*diag(2); S2 = 0.25 * diag(2)
X = rbind(mvrnorm(n,m1,S1),mvrnorm(2*n,m2,S2))
cls = rep(1:2,c(n,2*n))
# Label perturbation
ind = rbinom(3*n,1,0.4); lb = cls
lb[ind==1 & cls==1] = 2
lb[ind==1 & cls==2] = 1
# Classification with RQDA
res = rqda(X,lb,X)
table(cls,res$cls)
The US congress vote data set contains the votes (yes, no, abstained or absent) of 434 members of the 98th US Congress on 16 different key issues. This data set involves three-level categorical data.
data("UScongress")
A data frame with 434 observations on 16 different key issues. The first variable indicates the political party of the congressmen.
http://archive.ics.uci.edu/ml/datasets/Congressional+Voting+Records
data(UScongress)
The handwritten digits usps358 data set is a subset of the famous USPS data from UCI, which contains only the 1 756 images of the digits 3, 5 and 8.
data("usps358")
A data frame with 1756 observations on the following 257 variables: cls is a numeric vector encoding the class of the digits, and V1 to V256 are numeric vectors corresponding to the pixels of the 16x16 images.
The data set is a subset of the famous USPS data from UCI (https://archive.ics.uci.edu/ml/index.php). The usps358 data set contains only the 1 756 images of the digits 3, 5 and 8 which are the most difficult digits to discriminate.
data(usps358)
A variable selection algorithm for clustering which implements the method described in Law et al. (2004) <doi:10.1109/TPAMI.2004.71>.
varSelEM(X,G,maxit=100,eps=1e-6)
X |
a data frame containing the observations to cluster. |
G |
the expected number of groups (integer). |
maxit |
the maximum number of iterations (integer). The default value is 100. |
eps |
the convergence threshold. The default value is 1e-6. |
A list is returned with the following elements:
mu |
the group means for relevant variables. |
sigma |
the group variances for relevant variables. |
lambda |
the group means for irrelevant variables. |
alpha |
the group variances for irrelevant variables. |
rho |
the feature saliency. |
P |
the group posterior probabilities. |
cls |
the group memberships. |
ll |
the log-likelihood value. |
C. Bouveyron
Law, M. H., Figueiredo, M. A. T., and Jain, A. K., Simultaneous feature selection and clustering using mixture models, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, pp. 1154–1166, 2004 <doi:10.1109/TPAMI.2004.71>.
data(wine27)
X = scale(wine27[,1:27])
cls = wine27$Type
# Clustering and variable selection with VarSelEM
res = varSelEM(X,G=3)
# Clustering table
table(cls,res$cls)
The bivariate Vélib data set contains data from the bike sharing system of Paris, called Vélib. The data are loading profiles and percentage of broken docks of the bike stations over one week. The data were collected every hour during the period Sunday 1st Sept. - Sunday 7th Sept., 2014. The data were first used in Bouveyron et al. (2015) <doi:10.1214/15-AOAS861>.
data("velib2D")
The format is:
- availableBikes: the loading profiles (number of available bikes / number of bike docks) of the 1189 stations at 181 time points.
- brokenDockss: the percentage of broken docks of the 1189 stations at 181 time points.
- position: the longitude and latitude of the 1189 bike stations.
- dates: the download dates.
- bonus: indicates if the station is on a hill (bonus = 1).
- names: the names of the stations.
The real-time data are available at https://developer.jcdecaux.com/ (with an API key).
The data were first used in C. Bouveyron, E. Côme and J. Jacques, The discriminative functional mixture model for the analysis of bike sharing systems, The Annals of Applied Statistics, vol. 9 (4), pp. 1726-1760, 2015 <doi:10.1214/15-AOAS861>.
data(velib2D)
The discrete version (count data) of the Vélib data set contains data from the bike sharing system of Paris, called Vélib. The data consist of the number of bikes at stations over one week. The data were collected every hour during the period Sunday 1st Sept. - Sunday 7th Sept., 2014. The data were first used in Bouveyron et al. (2015) <doi:10.1214/15-AOAS861>.
data("velibCount")
The format is:
- data: the number of available bikes of the 1189 stations at 181 time points.
- position: the longitude and latitude of the 1189 bike stations.
- dates: the download dates.
- bonus: indicates if the station is on a hill (bonus = 1).
- names: the names of the stations.
The real-time data are available at https://developer.jcdecaux.com/ (with an API key).
The data were first used in C. Bouveyron, E. Côme and J. Jacques, The discriminative functional mixture model for the analysis of bike sharing systems, The Annals of Applied Statistics, vol. 9 (4), pp. 1726-1760, 2015 <doi:10.1214/15-AOAS861>.
data(velibCount)
The (27-dimensional) Italian Wine data set is the result of a chemical analysis of 178 wines grown in the same region in Italy but derived from three different cultivars. The analysis determined the quantities of 27 constituents found in each of the three types of wines.
data("wine27")
A data frame with 178 observations on the following 29 variables.
Alcohol
a numeric vector
Sugar.free_extract
a numeric vector
Fixed_acidity
a numeric vector
Tartaric_acid
a numeric vector
Malic_acid
a numeric vector
Uronic_acids
a numeric vector
pH
a numeric vector
Ash
a numeric vector
Alcalinity_of_ash
a numeric vector
Potassium
a numeric vector
Calcium
a numeric vector
Magnesium
a numeric vector
Phosphate
a numeric vector
Chloride
a numeric vector
Total_phenols
a numeric vector
Flavanoids
a numeric vector
Nonflavanoid_phenols
a numeric vector
Proanthocyanins
a numeric vector
Color_Intensity
a numeric vector
Hue
a numeric vector
OD280.OD315_of_diluted_wines
a numeric vector
OD280.OD315_of_flavanoids
a numeric vector
Glycerol
a numeric vector
X2.3.butanediol
a numeric vector
Total_nitrogen
a numeric vector
Proline
a numeric vector
Methanol
a numeric vector
Type
a factor with levels Barbera, Barolo, Grignolino
Year
a numeric vector
This data set is an expanded version of the popular one from the UCI machine learning repository (http://archive.ics.uci.edu/ml/datasets/Wine).
data(wine27)