WEP – Wine Economics and Policy
Just Accepted Manuscript

Consistency of expert product reviews: an application to wine guides

Gabriel I. Penagos-Londoño1, Felipe Ruiz-Moreno2, Ricardo Sellers-Rubio3, Salvador Del Barrio-García4, Ana B. Casado-Díaz5

1 Pontifical Xavierian University, Department of Economics, Carrera 7 No. 40 – 62, Bogota – COLOMBIA, Email: penagosi@javeriana.edu.co
2 University of Alicante, Department of Marketing, Crta. San Vicente s/n. 03690, Alicante – SPAIN, Email: felipe.ruiz@ua.es
3 University of Alicante, Department of Marketing, Crta. San Vicente s/n. 03690, Alicante – SPAIN, Email: ricardo.sellers@ua.es
4 University of Granada, Department of Marketing and Marketing Research, Campus Universitario de Cartuja, 18071, Granada – SPAIN, Email: dbarrio@ugr.es
5 University of Alicante, Department of Marketing, Crta. San Vicente s/n. 03690, Alicante – SPAIN, Email: ana.casado@ua.es

Correspondence concerning this article should be addressed to Felipe Ruiz-Moreno, University of Alicante, Department of Marketing, Crta. San Vicente s/n. 03690, Alicante – SPAIN, Email: felipe.ruiz@ua.es

This article has been accepted for publication and undergone full peer review but has not been through the copyediting, typesetting, pagination and proofreading process, which may lead to differences between this version and the Version of Record.

Please cite this article as:

Penagos-Londoño G.I., Ruiz-Moreno F., Sellers-Rubio R., Del Barrio-García S., Casado-Díaz A.B. (2022), Consistency of expert product reviews: an application to wine guides, Wine Economics and Policy, Just Accepted.
DOI: 10.36253/wep-12400

Abstract

Purpose.
The purpose of this study is to examine the internal consistency of wine guides by comparing the judgements of expert wine tasters and reviewers. A classification of wines is provided to establish whether expert reviews of similar wines are coherent.
Design/methodology/approach. Sentiment analysis based on natural language processing techniques was used to compare quantitative and qualitative reviews between experts. In addition, a finite mixture model was used to classify wines into categories to analyse internal consistency between ratings.
Findings. The results for a sample of more than 200,000 Wine Enthusiast ratings reveal significant differences between expert reviews. This finding indicates that there are no standard criteria for reviewing wines included in the guide.
Originality. Wine guides are amongst the most widely used marketing resources in the wine industry. They provide a signal to consumers about the quality of wines, guiding their purchase decisions. They also influence the reputation of brands and the performance of companies producing these wines. The main contribution of this study is to propose a new way to compare the reviews of wine guide experts.

Keywords: reputation, wine, expert ratings, sentiment analysis, finite mixture model, wine guides

1. Introduction
Information influences users’ decision-making processes. However, information asymmetry generally exists in the buyer-seller relationship because each party has a different amount of information about products [1]. Research on experiential and hedonic consumption has shown that consumers’ behaviour is affected by “social influence including peer input (word-of-mouth) and judgments of respected experts (professional evaluations)” [2, p.180].
Wine is an experience product whose quality cannot be assessed by consumers before purchase and consumption [3, 4]. This feature of wine increases the complexity of the purchase decision process. Thus, information asymmetries arise between consumers and winemakers in relation to product quality. Accordingly, high- and low-quality products can coexist in the market [5]. Wineries employ different marketing strategies to reduce these asymmetries and inform the market about the quality of their products [6]. Some use advertising in the mainstream media and encourage positive word-of-mouth communication amongst consumers [7, 8]. They also use awards in national and international competitions as part of their branding and communication strategies [6]. Finally, receiving high ratings in well-known wine guides, which are managed by experts and prescribers, can also help reduce information asymmetries between winemakers and consumers.
This study focuses on the social influence of experts in wine guides. Wine guides offer thousands of reviews of wines from around the world, basing their reviews on the opinions of panels of experts who taste these wines. The assumption is that consumers use judgements of wine quality by expert reviewers in wine guides as a source of information to make purchase decisions [9]. These expert reviewers might consequently influence the performance of the wine-producing companies. Previous research has in fact shown that there is a relationship between online reviews and consumer choice and firm sales [10, 11]. However, despite the potential impact on consumers and wineries, the nature and effects of expert opinions in wine guides remain an under-researched topic.
Wine experts usually provide a quantitative (score) and a qualitative (comment) review. The aim of this study is to test the consistency between these two assessments (quantitative and qualitative) of tasted wines.
For wine guides to offer a credible source of information, both assessments of the same wine should match. That is, higher scores should be aligned with more positive comments. This analysis can confirm the role of expert evaluations as a credible source of information for consumers.
To test the consistency of wine experts’ reviews, the qualitative content (i.e. tasting notes) is examined using sentiment analysis based on natural language processing techniques. Then, these reviews and other relevant variables (origin and grape variety) are used to establish whether expert reviews of similar wines are coherent. Coherence is examined by classifying wines according to reviews and wine-related variables. A finite mixture model is employed for this classification. The study context is the Wine Enthusiast guide, one of the most prestigious wine guides in the world. The results show significant differences between expert reviews, which raises doubts about the usefulness and credibility of wine guides as a source of information.

2. Literature review
2.1. Wine guides as a marketing tool
Guides are extremely popular in the wine industry because they offer a point of comparison across brands [12] and provide consumers with a signal of wine quality. Wine guides are based on the opinions of experts and professional tasters, who follow standardised, systematic procedures that aim to provide a rigorous assessment of wines. These experts and tasters are assumed to be independent of wineries, thus helping consumers make informed purchase decisions, as the learning process necessary for consumers to become wine experts themselves takes time [13].
Research has highlighted the effect of wine expert recommendations from a marketing perspective.
Parsons and Thompson [14] showed that consumers attribute high credibility to independent wine expert recommendations. Friberg and Grönqvist [15] found a significant effect of positive reviews by experts on the sales of the wines they had tasted. The scores that wines receive in these guides can also influence other marketing variables. A line of research has focused on the effect of expert reviews on wine prices [16]. For instance, studies have shown a positive effect of this type of evaluation on product prices, associated with a greater product reputation [7, 17]. Ashenfelter and Jones [18] showed that the influence of expert ratings on the price of wine is even greater than that of other factors such as terroir conditions or climate, which are commonly used to predict wine prices [19].
Wine research has also used the sensory reviews of experts in wine guides to measure wine quality and brand reputation [20]. Dressler [21] analysed the reputation of German wineries, individually and collectively, using three wine guides (Feinschmecker, Gault Millau and Eichelmann) and found consistent judgements across all three. Focused on Sicilian wines, Roma et al. [9] used experts’ scores in wine guides as a proxy of firm (wine) reputation. This approach is common in the wine literature [22]. However, despite this evidence, the impact of a positive expert review on the price of a wine may depend not only on the reputation of the wine itself but also on the reputation of the expert [23, 24] because not all experts or guides have the same reputation and prestige [25].

2.2. The expert-consistency effect
According to dual-process theory [26], individuals’ opinions and even behaviours are based on informational and normative influences such as those from expert reviews [27–29].
Information has a greater impact on the receiver if the sender is perceived as credible. Expert information is believed to be more credible and accurate (i.e. consistent) than non-expert information [30, 31].
In the context of wine, it is difficult to identify the factors that each expert considers when making judgements and rating wines because there is no common frame of reference across guides [16, 32]. An expert’s rating is not necessarily an objective indicator of the quality of a wine because experts make judgements based on their own personal preferences. Thus, when an expert gives a high rating to a certain wine, it is not intended to convey the idea that the wine is of a higher quality than another wine with a lower rating. This lack of comparability arises because ratings of wines are conditioned by several factors such as origin, vintage, winery, price and even the expectations of the expert. Therefore, a higher score for one wine than for another simply indicates an expert’s greater preference for that wine.
Consequently, despite their alleged objectivity (as stated in wine guides), expert reviews cannot be considered absolute objective assessments of wine quality. For instance, they may be biased by experts’ personal preferences [33]. Evidence regarding the consistency of expert judgements is somewhat mixed. Some authors have found consistency between different experts’ reviews of the same wine (e.g. [34]). However, other authors have expressed concern about inconsistencies between different experts’ opinions of wine quality and even inconsistencies in reviews by the same expert over time (e.g. [35–37]). Cao and Stokes [38] reported that personal bias in wine expert reviews translates into different ratings, discriminatory capacity and variability in the ratings of different wines.
Likewise, Ashton [35, 39] observed that wine guides focus on a few wines and cannot be considered fair representations of the entire market, noting that even the number of tasters used to issue a rating can influence the rating. These guides continue to be highly important in many markets and are used as a reference by consumers around the world. Therefore, further investigation of the effects of expert consistency/inconsistency is warranted.

2.3. Sentiment analysis: a tool for analysing the consistency of expert reviews
In recent years, natural language processing techniques have allowed researchers to perform textual and sentiment analysis of reviews by both experts and consumers (e.g. [40–46]). Sentiment analysis is a subfield of natural language processing that focuses on automatically classifying a text by its valence [47]. It enables the extraction of information on opinions about a subject (from users or experts) for a certain product [48, 49]. Previous research has shown that this type of analysis based on the characteristics of the product can provide more precise information than a general analysis of the overall (numerical) assessment [50]. Recent literature reviews have highlighted the importance and uniqueness of sentiment analysis in marketing research [51] and in hospitality and tourism [52].
In the context of wine guides, users typically find two ratings or judgements of a given wine. The first is a numerical score, usually on a scale of 0 to 100 points or 0 to 20 points, depending on the guide. Some guides only publish wines that receive a minimum score of 80 or 85 points. The second rating is a qualitative review based on tasting notes for the wine. These tasting notes consist of a brief literal description of the sensory and organoleptic qualities of the wine [53].
Although numerical scores are easily interpretable, the natural limitations of language hinder and complicate the task of using words to convey what a wine is really like and to describe the sensations that the expert wants to convey [54]. Sometimes, the sensory characteristics of wines are so special or unusual that there may not be the right words to describe them. Furthermore, some authors suggest that the language of professional tasting, which is used to describe the sensory properties of a wine, is based on jargon and vocabulary that are so complex and difficult to decipher that only the experts themselves or the most experienced consumers can understand them. In fact, Peynaud and Blouin [55] found that for professional tasting notes to be effective, consumers must have a high level of understanding about tasting, which is not always the case. Sometimes, these tasting notes may be pretentious, offering little informational validity for consumers [56].
Therefore, sentiment analysis based on each of the characteristics considered in the tasting notes could offer a broader and more accurate illustration of how experts review a wine. From an analytical perspective, the opinions of experts require analysis at the sentence level [57]. This sentence-level focus is necessary because experts who review wines consider different characteristics or attributes and generally have a different opinion on each of these aspects. Although many sentiment analysis tools can easily divide comments into negative, positive or neutral, a textual review of a given wine may contain phrases with different polarities because experts may have different feelings about each characteristic of the wine. For instance, the standard tasting phases (i.e. sight, smell and taste) may have different polarities, with some aspects being rated positively, others negatively and others neutrally.
In addition, there may be different degrees of positive or negative opinions. Accordingly, reviews cannot be qualified simply as positive, negative or neutral. Instead, they include a series of additive perceptions that create a nuanced rating and provide specific information on each of the aspects evaluated by the expert. For instance, some characteristics of the wine (e.g. in the olfactory phase of tasting) may be rated positively, whereas others (e.g. related to the palate) may be negatively rated.
In sum, sentiment analysis techniques could lead to precise inference of the overall numerical score for the wine. Therefore, these techniques are particularly useful for examining the opinions of experts about the wines in a guide. Nguyen et al. [58] recently employed a similar approach, focusing on so-called online expert users.

3. Method
This study focuses on reviews by 19 professional wine tasters from the Wine Enthusiast guide between 1999 and 2019. Wine Enthusiast Magazine is one of the most prestigious international magazines in the sector, together with The Wine Advocate (Robert Parker). Each review included qualitative tasting notes, in which the expert gave a judgement on the tasted wine, a quantitative score of the wine (from 80 to 100 points), and some additional characteristics such as price, origin and grape variety (see Figure 1). The wines were from 43 countries and their price ranged from 4 dollars to 3,400 dollars. After the elimination of outliers and missing cases, the final sample contained 201,004 reviews.

FIGURE 1. Sample Wine Enthusiast guide review
Source: Wine Enthusiast website (2021).

The method had two stages. In the first stage, the quantitative ratings and the qualitative reviews were compared across the different experts in the guide.
Reviews published in the guide were made by 19 experts, as well as some other anonymous reviewers. Although the comparison of quantitative ratings was straightforward, the comparison of qualitative reviews required prior analysis of tasting notes using sentiment analysis. This analysis was carried out using the AFINN lexicon. AFINN consists of 2,477 words in English that express a certain degree of positive or negative sentiment. This corpus of words, produced by Finn Årup Nielsen between 2009 and 2011, contains a rating for words ranging from −5 (most negative sentiment) to +5 (most positive sentiment). This lexicon displays the information in two columns: the word next to its corresponding value (e.g. “awesome”, 4, or “awful”, −3). In this study, the sentiment value of the expert review was calculated as the sum of the polarity of each of the words used in the review. In essence, each review was divided into sentences, and each sentence into words. To evaluate one sentence of the review, each word was assigned a value according to the AFINN lexicon. Adding up the values of all words in the sentence gave an evaluation of that specific comment. Once this process had been performed for all sentences in the review, the evaluations of each sentence or comment were summed to give an overall score for the review. Because an expert review covers different aspects, different opinions can be found in the same review. That is, the same review might contain both positive comments (e.g. regarding palate) and negative comments (e.g. regarding nose). However, the additive procedure employed in this study gave an overall evaluation of the intensity (value) and polarity (positive/negative) of the review based on the evaluation of each comment in the review.
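The additive scoring just described can be illustrated with a minimal sketch. It is not the authors' code: the function names are hypothetical, the sentence and word splitting rules are simplifying assumptions, and the tiny lexicon is a stand-in in which only “awesome” (+4) and “awful” (−3) are the values quoted in the text, while the other entries are invented placeholders rather than real AFINN values.

```python
import re

# Stand-in lexicon (assumption). Only "awesome" (+4) and "awful" (-3) are
# values quoted in the text; the other entries are hypothetical placeholders
# and are NOT taken from the real AFINN word list.
TOY_LEXICON = {"awesome": 4, "awful": -3, "fresh": 1, "elegant": 2, "harsh": -2}


def split_sentences(review):
    """Split a review into sentences on '.', '!' or '?' (a simplification)."""
    return [s for s in re.split(r"[.!?]+", review) if s.strip()]


def sentence_score(sentence, lexicon):
    """Sum the lexicon values of the words in one sentence (unknown words count 0)."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(lexicon.get(word, 0) for word in words)


def review_score(review, lexicon=TOY_LEXICON):
    """Add up the sentence scores to obtain the overall score of the review."""
    return sum(sentence_score(s, lexicon) for s in split_sentences(review))


example = "Awesome nose, fresh and elegant. The palate is harsh and the finish awful."
print(review_score(example))
```

In this toy review the first sentence is positive and the second negative, so the total reflects both polarity and intensity, and a longer review with more scored words can move further from zero than a short one.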
Compared to the alternative of using the average of the individual evaluations of each word, this additive procedure accounted for the length of the review because there is evidence that longer reviews provide greater added value to the tasting note of the wine [53]. In addition, it provided a broader ranking of the review than a simple classification as positive, negative or neutral.
In the second stage, the wines were classified according to their characteristics using techniques based on cluster analysis. The starting assumption was that wines in a given group were homogeneous but different from the wines in other groups. Each wine was defined by a set of variables related to its review (qualitative and quantitative), origin and grape. The objective of this stage was to group similar wines by comparing specific vectors for the set of variables used in this study. An $N \times d$ matrix was created for this analysis, where the columns were the variables, and the rows were the observations. Each observation (i.e. row) was a vector of dimension $d$, denoted as $x_i$. The data set was denoted as $x = (x_i)_{i \in \{1,\dots,N\}}$. Each observation had $d_{cont}$ continuous variables in $\mathbb{R}^{d_{cont}}$ and $d_{cat}$ categorical variables, with $\{1,\dots,m_j\}$ levels for each nominal variable $j$. Hence, $d_{cont} + d_{cat} = d$.
To classify the observations into groups that could be interpreted in a meaningful way, an unsupervised learning method was used. It was hypothesised that there existed hidden or latent variables (unobserved random variables) for all data points in the data set that associated a specific cluster to each observation. Thus, the latent variable model was a mixture model.
In a mixture model, $K$ distributions are mixed, and it is assumed that each observation belongs to one of them. The latent variable $z_i$ for observation $i$ corresponds to one of the distributions in the mixture.
In other words, the latent variable $z_i$ is the cluster to which observation $x_i$ belongs. If the number of clusters is $K$, then $z_i \in \{1,\dots,K\}$, and the set of latent variables is denoted as $z = (z_i)_{i \in \{1,\dots,N\}}$. In a mixture model, the data generation process is assumed to be $p(z_i = k, x_i) = p(z_i = k)\,p(x_i \mid z_i = k)$. Here, $p(z_i)$ is a multinomial distribution, where $\eta_k = \Pr(z_i = k)$ is the probability that observation $i$ belongs to cluster $k$. The probabilities $\eta = (\eta_k)_{k \in \{1,\dots,K\}}$ are referred to as the mixing weights. Furthermore, $\phi_k(x_i \mid \theta_k) = p(x_i \mid z_i = k)$ is the probability distribution of the data in cluster $k$, and $\theta_k$ are the parameters of this distribution. The probability density function is given as follows:

$$f(x_i \mid \theta) = \sum_{k=1}^{K} \eta_k \, \phi_k(x_i \mid \theta_k)$$

where $\theta = (\theta_k)_{k \in \{1,\dots,K\}}$ is the set of all parameters for the distributions in the mixture, including the mixing weights.
For continuous variables, the cluster distributions were multivariate Gaussian distributions $\phi_k(x_i \mid \theta_k) = \mathcal{N}(x_i \mid \mu_k, \Sigma_k)$, where the parameters of distribution $k$, $\theta_k = \{\mu_k, \Sigma_k\}$, were the mean vector $\mu_k$ and covariance matrix $\Sigma_k$. Categorical variables were assumed to be independent multivariate multinomial variables distributed conditional on the latent variable. Therefore, $\phi_k(x_i \mid \theta_k) = \mathcal{M}(x_i \mid \alpha_k)$ for $\alpha_k = (\alpha_{jk})_{j \in \{1,\dots,d_{cat}\}}$, where $\alpha_{jk}$ is the vector of parameters (event probabilities) for the multinomial distribution associated with variable $j$ in cluster $k$, and its dimension is $m_j$.
For the estimation of the parameters, the R package Rmixmod version 2.1.5 was used. This package maximises the log-likelihood with an expectation-maximisation (EM) algorithm as follows:

$$\mathcal{L}(\Theta) = \sum_{i=1}^{N} \ln f(x_i \mid \theta)$$

for $\Theta = \{\eta, \theta\}$, the set of all parameters of the mixture.
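To make the EM idea concrete, the following is a minimal sketch for a deliberately simplified case: a univariate Gaussian mixture fitted in pure Python. It is not the Rmixmod implementation and omits the categorical variables used in the study; the function names, the deterministic initialisation, the fixed number of iterations and the variance floor are all assumptions made for illustration.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def log_likelihood(data, weights, means, variances):
    """Sum over observations of ln f(x_i), where f is the mixture density."""
    total = 0.0
    for x in data:
        density = sum(w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances))
        total += math.log(density)
    return total


def em_gaussian_mixture(data, k, n_iter=50, var_floor=1e-6):
    """Fit a k-component (k >= 2) univariate Gaussian mixture with EM."""
    n = len(data)
    lo, hi = min(data), max(data)
    # Assumed initialisation: means spread evenly over the data range,
    # a common variance and equal mixing weights.
    means = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    grand_mean = sum(data) / n
    grand_var = sum((x - grand_mean) ** 2 for x in data) / n
    variances = [max(grand_var, var_floor)] * k
    weights = [1.0 / k] * k
    history = []
    for _ in range(n_iter):
        # E-step: posterior probability that each observation belongs to each cluster.
        resp = []
        for x in data:
            joint = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
            norm = sum(joint)
            resp.append([p / norm for p in joint])
        # M-step: update mixing weights, means and variances from those posteriors.
        for j in range(k):
            n_j = sum(r[j] for r in resp)
            weights[j] = n_j / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / n_j
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / n_j
            variances[j] = max(spread, var_floor)  # floor guards against collapse
        history.append(log_likelihood(data, weights, means, variances))
    return weights, means, variances, history


# Toy data with two well-separated groups (invented for illustration).
data = [-2.1, -1.9, -2.0, -2.2, -1.8, 2.0, 2.1, 1.9, 2.2, 1.8]
weights, means, variances, history = em_gaussian_mixture(data, k=2)
print([round(m, 2) for m in means], [round(w, 2) for w in weights])
```

The `history` list records the log-likelihood after each iteration, which should not decrease as the algorithm proceeds. In practice, the number of clusters would be chosen by fitting the model for several values of K and comparing an information criterion.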
Once the wines had been classified into similar groups, the differences between the expert reviews of the wines belonging to each cluster were analysed. The data processing and estimation were carried out in MATLAB.

4. Results
In the first stage, the quantitative and qualitative expert reviews in the guide were compared. The average score of the tasted wines was 88.81 points (SD = 3.03), with a minimum of 80 points and a maximum of 100. The experts used an average of 40.56 words in their descriptions of wines (SD = 11.28), with a minimum of three words and a maximum of 135. The average sentiment score was 3.2 points (SD = 7.02), with a minimum of -33 points and a maximum of 41. The average price was 36.62 dollars (SD = 43.17), with a minimum of 4 dollars and a maximum of 3,400 dollars.
Table I presents the average quantitative and sentiment ratings for each expert. It also shows the average number of words used by each expert in the tasting notes. There are statistically significant differences between the experts’ quantitative ratings. There are also differences in the nuances provided in the tasting notes, as reflected by the differences in the number of words used and the sentiment ratings for the experts.

TABLE I. Ratings of wines according to experts

| Expert | No. of wines tasted | Average quantitative score | Average sentiment rating | Average number of words |
|---|---|---|---|---|
| Alexander Peartree | 1,637 | 87.14 | -1.28 | 41.26 |
| Anna Lee C. Iijima | 8,061 | 89.37 | 0.83 | 41.38 |
| Anne Krebiehl MW | 7,661 | 91.02 | 5.27 | 47.17 |
| Carrie Dykes | 268 | 86.45 | 1.10 | 42.75 |
| Christina Pickard | 2,349 | 88.97 | 1.72 | 57.00 |
| Fiona Adams | 408 | 86.72 | -3.91 | 49.77 |
| Jeff Jenssen | 783 | 88.08 | -1.39 | 35.75 |
| Jim Gordon | 9,083 | 88.71 | 4.71 | 38.12 |
| Joe Czerwinski | 5,842 | 88.66 | 0.24 | 40.96 |
| Kerin O’Keefe | 20,055 | 89.12 | -1.88 | 38.03 |
| Lauren Buzzeo | 2,886 | 88.00 | 3.18 | 50.53 |
| Matt Kettmann | 13,910 | 90.21 | -0.43 | 44.40 |
| Michael Schachner | 20,004 | 86.99 | 0.28 | 42.42 |
| Mike DeSimone | 956 | 89.07 | -0.44 | 43.21 |
| Paul Gregutt | 13,824 | 89.34 | 4.61 | 43.48 |
| Roger Voss | 40,124 | 88.90 | 8.58 | 37.47 |
| Sean P. Sullivan | 9,197 | 88.67 | 1.74 | 38.39 |
| Susan Kostrzewa | 1,170 | 86.89 | 6.03 | 39.71 |
| Virginie Boone | 17,578 | 89.67 | 2.75 | 38.71 |
| Nameless | 25,208 | 87.81 | 4.10 | 38.96 |
| Total | 201,004 | 88.81 | 3.20 | 40.55 |
| F | | 1158.84 (p < 0.001) | 3534.31 (p < 0.001) | 1351.94 (p < 0.001) |

Source: Authors

In the second stage, the wines were classified according to their characteristics using techniques based on cluster analysis. The proposed model was estimated for K = 2, ..., 7 clusters in relation to the wines appearing in this guide. To identify the clusters, four variables were used: the quantitative rating, sentiment score of the tasting note, country of origin of the wine and grape variety. The model selection criterion was the Bayesian information criterion (BIC) [59]. This criterion suggested that K = 4 was the number of groups that best fit the data (see Table II). External validation is also desirable to confirm the usefulness of the cluster solution. External validation consisted of examining whether there were also intercluster differences in variables other than those used to classify the wines. This external validation served as an exploratory investigation of the influence of the cluster structure and main characteristics [60]. To this end, the price variable was also examined (see Table II).

TABLE II.
Descriptive analysis of clusters with mean and standard deviation (in parentheses)

The first four variables were used in the cluster analysis; price was used for external validation.

| Cluster | Quantitative rating | Qualitative (sentiment) rating | Main country of origin | Main grape variety | Price (external validation) |
|---|---|---|---|---|---|
| Best quality (N = 56,043) | 90.09 (2.77) | 10.26 (5.74) | France | Red & White | 41.50 (65.01) |
| Affordable (N = 48,321) | 85.29 (1.74) | 1.33 (4.47) | America, France and Spain | Red & White | 21.10 (16.40) |
| Over-priced (N = 67,789) | 90.00 (2.23) | 0.08 (5.75) | United States and Italy | Red | 47.24 (37.42) |
| Smart choice (N = 28,851) | 89.41 (2.15) | -0.02 (5.65) | United States | White | 28.80 (25.02) |
| TOTAL (N = 201,004) | 88.81 (3.03) | 3.21 (7.02) | N.A. | N.A. | 36.62 (43.16) |

Source: Authors

The empirical findings reveal some interesting differences between the clusters. The first group, “top-of-the-range wines (best quality)”, consists of wines with a well-above-average rating based on both sentiment and quantitative ratings. These wines are also on average more expensive. This group consists of red and white wines, mainly from France. The second group, “low-price wines (affordable/low cost)”, consists of wines with a below-average quantitative score but with a slightly positive sentiment rating. The average price of wines in this group is well below the average for the entire sample. This group includes white and red wines from North and South America, France and Spain. The third group, “overpriced wines”, consists of wines with a neutral sentiment rating but a roughly average quantitative score. These wines’ average price is well above the average for the entire sample. They are mostly red wines from the United States and Italy. Finally, the fourth group, “best-value wines (smart choice)”, consists of wines with a roughly average quantitative score and a below-average qualitative rating. They also have a lower-than-average price. This group mainly consists of white wines from the United States.
The differences between the four groups were significant for the four variables considered in the analysis. In addition, for the external validation of the four clusters, ANOVA was used to test whether the prices differed between clusters. The price variable (F = 4064.87; p < 0.0001) differed significantly between clusters, thereby externally validating the classification presented in this research.
Once the wines had been classified into homogeneous groups, the average sentiment evaluations of the tasters were calculated for each group. The results indicate that the experts’ reviews differ significantly within each group, which shows that there are no standard criteria for reviewing the wines in the guide (see Table III). This result reinforces the earlier idea (see Table I) that tasting notes might differ amongst wine experts, even when the tasted wines are similar and receive a comparable quantitative rating.

TABLE III. Test of differences of experts’ sentiment ratings

| Group | F | p value |
|---|---|---|
| Group 1. Best quality | 382.65 | p < 0.001 |
| Group 2. Affordable | 110.97 | p < 0.001 |
| Group 3. Overpriced | 295.44 | p < 0.001 |
| Group 4. Smart choice | 151.12 | p < 0.001 |

Source: Authors

5. Conclusions
Wine guides written by professional and expert tasters are widely used in the wine industry to market wine, providing important information signals for consumers around the world. However, despite the importance of these guides, some authors have expressed doubts about the consistency of the scores and reviews they provide. The objective of this study was to analyse the internal consistency of the scores and reviews of the experts and professional tasters writing for a specific guide. The method included sentiment analysis of the tasting notes and a novel clustering technique that identified groups of wines with similar characteristics.
The results show considerable divergence between the qualitative and quantitative assessments by professional tasters in the Wine Enthusiast wine guide. Although most consumers trust the guide to reduce their information asymmetries with respect to winemakers, disparity in the criteria used by the guide’s experts raises doubts over its effectiveness as a source of reliable, verified, standardised information for consumers. In fact, even when wines are grouped according to their characteristics, there are still discrepancies amongst experts. Therefore, it cannot be said that the guide follows a single, uniform set of criteria for its wine reviews.
These results have managerial implications for the wine sector. First, the results have implications for wineries whose wines are tasted by experts writing for this guide. These wineries should be aware that experts’ personal preferences may affect their judgements. Hence, knowing the personal tastes and background of each expert could help wineries improve the ratings of their wines. Second, these results are important for the management of the guide itself. The reputation and prestige of a particular guide are the basis of consumers’ trust in that guide, which is considered a reliable and independent source of information. If the reviews in the guide are inconsistent and the experts do not reach a consensus when rating wines, doubts may arise about the reliability of these reviews, depending on which expert tasted the wine. These doubts could ultimately affect the publication’s reputation.
Finally, regarding the limitations of this study, only one guide (Wine Enthusiast) was analysed. It is not possible to extrapolate these results to other specialist publications within the sector. Furthermore, the sentiment analysis was carried out using a specific lexicon.
Although this lexicon has been widely used in academic studies, it is not the only available alternative, nor is it specific to the wine sector. These limitations open new research opportunities that should be addressed in the future. Future research could also explore the effect of reviewer expertise in the context of wine guides. Reviewer expertise has already been shown to influence reviewer ratings in the context of hotel and restaurant review platforms [58]. Finally, future research could extend this analysis to other markets where guides based on expert reviews are also common. Examples include the film and television industry, where sentiment analysis techniques have already been used to study expert and consumer opinions [2] but not to study specialised guides (e.g. Rotten Tomatoes).

References

[1] L. Fan, X. Zhang, and L. Rai, When should star power and eWOM be responsible for the box office performance? - An empirical study based on signaling theory, J. Retail. Consum. Serv., 62, 102591, 2021, https://doi.org/10.1016/j.jretconser.2021.102591.
[2] R. Niraj and J. Singh, Impact of User-Generated and Professional Critics Reviews on Bollywood Movie Success, Australas. Mark. J., 23(3), 179–187, 2015, https://doi.org/10.1016/j.ausmj.2015.02.001.
[3] E. Oczkowski, Hedonic Wine Price Functions and Measurement Error, Econ. Rec., 77(239), 374–382, 2001, https://doi.org/10.1111/1475-4932.00030.
[4] S. Petropoulos, C. S. Karavas, A. T. Balafoutis, I. Paraskevopoulos, S. Kallithraka, and Y. Kotseridis, Fuzzy logic tool for wine quality classification, Comput. Electron. Agric., 142, 552–562, 2017, https://doi.org/10.1016/j.compag.2017.11.015.
[5] G. A. Akerlof, The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism, Q. J. Econ., 84(3), 488–500, 1970, https://doi.org/10.2307/1879431.
[6] U. R. Orth and P.
Krška, Quality signals in wine marketing: the role of exhibition awards, Int. Food Agribus. Manag. Rev., 4(4), 385–397, 2001.
[7] R. Sellers-Rubio, F. Mas-Ruiz, and F. Sancho-Esper, Firm reputation, advertising investment, and price premium: The role of collective brand membership in high-quality wines, Agribusiness, 34(2), 351–362, 2018, https://doi.org/10.1002/agr.21526.
[8] T. Spawton, Marketing planning for wine, Int. J. Wine Mark., 2(2), 2–49, 1990.
[9] P. Roma, G. Di Martino, and G. Perrone, What to show on the wine labels: a hedonic analysis of price drivers of Sicilian wines, Appl. Econ., 45(19), 2765–2778, 2013, https://doi.org/10.1080/00036846.2012.678983.
[10] K. Floyd, R. Freling, S. Alhoqail, H. Y. Cho, and T. Freling, How Online Product Reviews Affect Retail Sales: A Meta-analysis, J. Retail., 90(2), 217–232, 2014, https://doi.org/10.1016/j.jretai.2014.04.004.
[11] N. Hu, L. Liu, and J. J. Zhang, Do online reviews affect product sales? The role of reviewer characteristics and temporal effects, Inf. Technol. Manag., 9(3), 201–214, 2008, https://doi.org/10.1007/s10799-008-0041-2.
[12] A. J. Blair, C. Atanasova, L. Pitt, A. Chan, and Å. Wallstrom, Assessing brand equity in the luxury wine market by exploiting tastemaker scores, J. Prod. Brand Manag., 26(5), 447–452, 2017, https://doi.org/10.1108/JPBM-06-2016-1214.
[13] K. A. Latour and J. A. Deighton, Learning to Become a Taste Expert, J. Consum. Res., 46(1), 1–19, 2019, https://doi.org/10.1093/jcr/ucy054.
[14] A. G. Parsons and A. Thompson, Wine recommendations: who do I believe?, Br. Food J., 111(9), 1003–1015, 2009, https://doi.org/10.1108/00070700910992899.
[15] R. Friberg and E. Grönqvist, Do Expert Reviews Affect the Demand for Wine?, Am. Econ. J. Appl. Econ., 4(1), 193–211, 2012, https://doi.org/10.1257/app.4.1.193.
[16] A. Albright, P.
Pedroni, and S. Sheppard, Uncorking Expert Reviews with Social Media: A Case Study Served with Wine, p. 19, 2018.
[17] S. González-Hernando, V. Iglesias-Argüelles, and C. González-Mieres, ¿Cómo afectan las prescripciones de terceras partes a las evaluaciones del consumidor y a la prima de precio del producto?, Catedra Fundación Ramón Areces de Distribución Comercial, 1304, 2013. Accessed: Apr. 08, 2021. [Online]. Available: https://ideas.repec.org/p/ovr/docfra/1304.html
[18] O. Ashenfelter and G. V. Jones, The Demand for Expert Opinion: Bordeaux Wine, J. Wine Econ., 8(3), 285–293, 2013, https://doi.org/10.1017/jwe.2013.22.
[19] O. Ashenfelter, D. Ashmore, and R. Lalonde, Bordeaux Wine Vintage Quality and the Weather, CHANCE, 8(4), 7–14, 1995, https://doi.org/10.1080/09332480.1995.10542468.
[20] B. A. Benjamin and J. M. Podolny, Status, quality, and social order in the California wine industry, Adm. Sci. Q., 44(3), 563–589, 1999.
[21] M. Dressler, Strategic winery reputation management – exploring German wine guides, Int. J. Wine Bus. Res., 28(1), 4–21, 2016, https://doi.org/10.1108/IJWBR-10-2014-0046.
[22] L. Benfratello, M. Piacenza, and S. Sacchetto, Taste or reputation: what drives market prices in the wine industry? Estimation of a hedonic model for Italian premium wines, Appl. Econ., 41(17), 2197–2209, 2009, https://doi.org/10.1080/00036840701222439.
[23] H. H. Ali and C. Nauges, The Pricing of Experience Goods: The Example of en primeur Wine, Am. J. Agric. Econ., 89(1), 91–103, 2007.
[24] D. A. Reinstein and C. M. Snyder, The Influence of Expert Reviews on Consumer Demand for Experience Goods: A Case Study of Movie Critics, J. Ind. Econ., 53(1), 27–51, 2005, https://doi.org/10.1111/j.0022-1821.2005.00244.x.
[25] V. Odorici and R.
Corrado, Between Supply and Demand: Intermediaries, Social Networks and the Construction of Quality in the Italian Wine Industry, J. Manag. Gov., 8(2), 149–171, 2004, https://doi.org/10.1023/B:MAGO.0000026542.18647.48.
[26] M. Deutsch and H. B. Gerard, A study of normative and informational social influences upon individual judgment, J. Abnorm. Soc. Psychol., 51, 629–636, 1955, https://doi.org/10.1037/h0046408.
[27] M. Y. Cheung, C. Luo, C. L. Sia, and H. Chen, Credibility of Electronic Word-of-Mouth: Informational and Normative Determinants of On-line Consumer Recommendations, Int. J. Electron. Commer., 13(4), 9–38, 2009, https://doi.org/10.2753/JEC1086-4415130402.
[28] A. Naujoks and M. Benkenstein, Who is behind the message? The power of expert reviews on eWOM platforms, Electron. Commer. Res. Appl., 44, 101015, 2020, https://doi.org/10.1016/j.elerap.2020.101015.
[29] P. Racherla and W. Friske, Perceived ‘usefulness’ of online consumer reviews: An exploratory investigation across three services categories, Electron. Commer. Res. Appl., 11(6), 548–559, 2012, https://doi.org/10.1016/j.elerap.2012.06.003.
[30] A. S. Lo and S. S. Yao, What makes hotel online reviews credible? An investigation of the roles of reviewer expertise, review rating consistency and review valence, Int. J. Contemp. Hosp. Manag., 31(1), 41–60, 2019, https://doi.org/10.1108/IJCHM-10-2017-0671.
[31] S. Quaschning, M. Pandelaere, and I. Vermeir, When Consistency Matters: The Effect of Valence Consistency on Review Helpfulness, J. Comput.-Mediat. Commun., 20(2), 136–152, 2015, https://doi.org/10.1111/jcc4.12106.
[32] I. Olkin, Y. Lou, L. Stokes, and J. Cao, Analyses of Wine-Tasting Data: A Tutorial, J. Wine Econ., 10(1), 4–30, 2015, https://doi.org/10.1017/jwe.2014.26.
[33] S. Castriota, D. Curzi, and M. Delmastro, Tasters’ bias in wine guides’ quality evaluations, Appl. Econ.
Lett., 20(12), 1174–1177, 2013.
[34] E. T. Stuen, J. R. Miller, and R. W. Stone, An Analysis of Wine Critic Consensus: A Study of Washington and California Wines, J. Wine Econ., 10(1), 47–61, 2015, https://doi.org/10.1017/jwe.2015.3.
[35] R. H. Ashton, Reliability and consensus of experienced wine judges: Expertise within and between, J. Wine Econ., 7(1), 70–87, 2012.
[36] R. H. Ashton, Is there consensus among wine quality ratings of prominent critics? An empirical analysis of red Bordeaux, 2004-2010, J. Wine Econ., 8(2), 225, 2013.
[37] R. T. Hodgson, An examination of judge reliability at a major US wine competition, J. Wine Econ., 3(2), 105–113, 2008.
[38] J. Cao and L. Stokes, Evaluation of Wine Judge Performance through Three Characteristics: Bias, Discrimination, and Variation, J. Wine Econ., 5(1), 132–142, 2010, https://doi.org/10.1017/S1931436100001413.
[39] R. H. Ashton, Improving Experts’ Wine Quality Judgments: Two Heads Are Better than One, J. Wine Econ., 6(2), 160–178, 2011, https://doi.org/10.1017/S1931436100001577.
[40] F. Caviggioli, L. Lamberti, P. Landoni, and P. Meola, Technology adoption news and corporate reputation: sentiment analysis about the introduction of Bitcoin, J. Prod. Brand Manag., 29(7), 877–897, 2020, https://doi.org/10.1108/JPBM-03-2018-1774.
[41] B. Chen, V. Velchev, J. Palmer, and T. Atkison, Wineinformatics: A Quantitative Analysis of Wine Reviewers, Fermentation, 4(4), 82, 2018, https://doi.org/10.3390/fermentation4040082.
[42] N. Kotonya, P. De Cristofaro, and E. De Cristofaro, Of Wines and Reviews: Measuring and Modeling the Vivino Wine Social Network, in 2018 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), Barcelona, 2018, 387–392, https://doi.org/10.1109/ASONAM.2018.8508776.
[43] E. Lefever, I. Hendrickx, and A.
van den Bosch, Very quaffable and great fun: applying NLP to wine reviews, in Computational Linguistics in the Netherlands, 2015. Accessed: Apr. 08, 2021. [Online]. Available: http://hdl.handle.net/1854/LU-7198586
[44] E. Lefever, I. Hendrickx, and I. Croijmans, Discovering the Language of Wine Reviews: A Text Mining Account, Proc. Elev. Int. Conf. Lang. Resour. Eval. LREC 2018, 3297–3302, 2018. Accessed: Apr. 08, 2021. [Online]. Available: https://eprints.whiterose.ac.uk/137244/
[45] S. Moon and W. A. Kamakura, A picture is worth a thousand words: Translating product reviews into a product positioning map, Int. J. Res. Mark., 34(1), 265–285, 2017, https://doi.org/10.1016/j.ijresmar.2016.05.007.
[46] G. J. Thebaud, P. Kraus, and M. Mondaresnezhad, Applying Text Data Analytics Techniques to Wine Reviews, presented at the AMCIS 2020 TREOs, 2020, 53, 2.
[47] B. Pang and L. Lee, Opinion mining and sentiment analysis, Found. Trends Inf. Retr., 2(1–2), 1–135, 2008.
[48] M. Joshi, P. Prajapati, A. Shaikh, and V. Vala, A survey on Sentiment Analysis, Int. J. Comput. Appl., 163(6), 34–38, 2017, https://doi.org/10.5120/ijca2017913552.
[49] X. Wang, T. Zhou, X. Wang, and Y. Fang, Harshness-aware sentiment mining framework for product review, Expert Syst. Appl., 187, 115887, 2022, https://doi.org/10.1016/j.eswa.2021.115887.
[50] A. Gandomi and M. Haider, Beyond the hype: Big data concepts, methods, and analytics, Int. J. Inf. Manag., 35(2), 137–144, 2015, https://doi.org/10.1016/j.ijinfomgt.2014.10.007.
[51] M. Rambocas and B. G. Pacheco, Online sentiment analysis in marketing research: a review, J. Res. Interact. Mark., 12(2), 146–163, 2018, https://doi.org/10.1108/JRIM-05-2017-0030.
[52] F. Mehraliyev, I. C. C. Chan, and A. P.
Kirilenko, Sentiment analysis in hospitality and tourism: a thematic and methodological review, Int. J. Contemp. Hosp. Manag., 34(1), 46–77, 2021, https://doi.org/10.1108/IJCHM-02-2021-0132.
[53] C. D. Ramirez, Do Tasting Notes Add Value? Evidence from Napa Wines, J. Wine Econ., 5(1), 143–163, 2010.
[54] K. Lehrer and A. Lehrer, Winespeak or critical communication? Why people talk about wine, 2008.
[55] E. Peynaud and J. Blouin, The Taste of Wine: The Art and Science of Wine Appreciation. John Wiley & Sons, 1996.
[56] R. E. Quandt, On Wine Bullshit: Some New Software?, J. Wine Econ., 2(2), 129–135, 2007, https://doi.org/10.1017/S1931436100000389.
[57] R. Feldman, Techniques and applications for sentiment analysis, Commun. ACM, 56(4), 82–89, 2013, https://doi.org/10.1145/2436256.2436274.
[58] P. Nguyen, X. (Shane) Wang, X. Li, and J. Cotte, Reviewing Experts’ Restraint from Extremes and Its Impact on Service Providers, J. Consum. Res., 47(5), 654–674, 2021, https://doi.org/10.1093/jcr/ucaa037.
[59] G. Schwarz, Estimating the Dimension of a Model, Ann. Stat., 6(2), 1978, https://doi.org/10.1214/aos/1176344136.
[60] R. P. Dant and G. T. Gundlach, The challenge of autonomy and dependence in franchised channels of distribution, J. Bus. Ventur., 14(1), 35–67, 1999, https://doi.org/10.1016/S0883-9026(97)00096-7.