Ital. J. Food Sci., vol. 30, 2018 - 775

PAPER

INFLUENCING FACTORS ON CHINESE WINE CONSUMERS’ BEHAVIOR UNDER DIFFERENT PURCHASING MOTIVATIONS BASED ON A MULTI-CLASSIFICATION METHOD

W. HUIRU¹, Z. ZHIJIAN¹, F. JIANYING², T. DONG² and M. WEISONG*²,³

¹College of Science, China Agricultural University, No.17 Qinghuadonglu, Haidian district, Beijing, 100083, P.R. China
²College of Information and Electrical Engineering, China Agricultural University, 209# No.17 Qinghuadonglu, Haidian district, Beijing, 100083, P.R. China
³Key Laboratory of Viticulture and Enology, Ministry of Agriculture
*Corresponding author: Tel.: +86 106273671717
E-mail address: wsmu@cau.edu.cn

ABSTRACT

This study investigates the importance rating of the factors that drive wine consumption in four specific situations (gift, banquet, party, and self-drinking) and thereby achieves consumer segmentation. The influencing factors, which include wine quality and socio-demographic variables, are measured on a nationally representative sample (N=609) in China. The Lasso method is used to select the factors, and a binary classifier, the v-twin support vector machine (v-TSVM), is extended to the multi-class case with a “one-versus-one” approach to predict the purchasing behavior of consumers. Monthly income, occupation, a consumer’s knowledge of wine, the origin of the wine, the vintage, and advertisement are critical factors in driving consumption. Wine color and packing emerge as leading factors when consumers purchase wine for gifts and banquets. Promotion contributes significantly to wine price selection for banquet, party, and self-drinking. The results show that the importance ranking of determinants varies under different purchasing motivations. In addition, the recognition accuracy can be considerably increased with prior knowledge of the consumption purpose.
The nonlinear classifier is recommended for application because it performs better than the linear one. This paper offers a fresh perspective on wine consumption behavior in China by applying two machine learning methods to identify and quantify determinants in specific situations. The results can assist wine managers in making informed decisions regarding wine production and marketing.

Keywords: wine motives, personal traits, wine price, influential factor, consumers’ purchasing behavior

1. INTRODUCTION

The Chinese wine market has been flourishing in recent years with the improvement of people’s living standards and the influence of affluent western lifestyles. In 2015, China produced 11.5 million hectoliters of wine and consumed 16 million hectoliters, ranking sixth in global wine production (OIV, 2016). Predicting the purchase behavior of consumers has been recognized as a significant research topic over the past decades. Accurate prediction of purchasing behavior enables wine dealers to identify consumers’ demands, formulate appropriate marketing strategies, and achieve consumer segmentation. A study on the influence of purchasing motivation will aid in understanding the process of consumer decision-making. Previous studies have shown, in a descriptive way, that wine consumers exhibit different purchasing motivations under various consumption scenarios (GERAGHTY and TORRES, 2009). The main motivations of Chinese wine consumers include health care, auxiliary dining, and social contact (LI, 2014). People perceive wine as a healthy and nutritional product that can be recommended for regular intake to prevent diseases because wine contains many kinds of organic acids, minerals, and vitamins (TANG, 2008; MU et al., 2016).
Consumers have different preferences for wine attributes depending on whether they drink at home, drink with friends, or give wine as a gift (QUESTER and SMART, 1998; LI, 2014; CHEN, 2014). Drinking with friends at parties is casual and relaxed, whereas the major function of wine at a business banquet is to please other people (HALL et al., 2001). Currently, an increasing number of Chinese consumers give red wine as a gift to display affection or enhance friendship, especially during festivals. Wine consumption is influenced by many interrelated factors, such as wine product properties, the lifestyle and situation of an individual, and psychological factors of consumers (PICKERING and HAYES, 2017; SCHMITT, 1997). Various attributes, such as taste, color, aroma, brand, production, and label information, are found to be important aspects that determine wine choice (THORPE, 2009). LOCKSHIN et al. (2017) summarized several methods used in marketing in combination with sensory science techniques to understand the changing consumer preferences in China. The consumption behavior of Chinese consumers is highly related to their educational background, wine-related activities, wine taste, country of origin, quality, and price (BALESTRINI and GAMBLE, 2006; CAMILLO, 2012).

Most business models are based on a linear equation to estimate the weight of such factors when measuring the response of purchase intention to the contextual factors. The commonly used linear models are linear discriminant analysis and logistic regression (CULBERT et al., 2017; HONORÉ-CHEDOZEAU et al., 2017; LI, 2014; YORMIRZOEV, 2016). Compared with other research fields, prediction models for purchase behavior are overly concentrated on, and reliant on, these linear models. In addition, principal component analysis (PCA) is often combined with the linear models to reduce the dimensionality of factors (JOLLIFFE, 2002; CHANG et al., 2015; TSOURGIANNIS et al., 2015).
However, using PCA to extract component features may lose certain important information. The meaning of the comprehensive evaluation function is unclear when the loadings in a principal component have both positive and negative signs; moreover, PCA is sensitive to the relative scaling of the original variables and has low variable interpretability. In addition, the development of communication technologies makes it possible to collect more consumer data. Analyses based on traditional linear models are insufficient to meet the requirements of academics and practitioners (DAYKIN and MOFFATT, 2002; THONG and SOLGAARD, 2017).

In recent decades, an increasing number of machine learning approaches have emerged. The least absolute shrinkage and selection operator (Lasso) is recognized for its capability to exploit information from ordinary data and its flexibility to capture different effects of explanatory variables (TIBSHIRANI, 1996). The Lasso method can continuously shrink certain coefficients to zero and automatically select a subset of variables. In addition, the Lasso method has better variable interpretability than other feature selection methods, such as principal component regression and least squares regression (TIAN et al., 2015). The support vector machine (SVM) has been considered an effective and promising binary classifier for its unique advantages (VAPNIK, 1995). Introducing a kernel function maps the training variables into a high-dimensional space, which allows the SVM to handle nonlinear problems. Many variants of SVM have been proposed since then, and several binary SVMs have been successfully extended to multi-class scenarios by applying “one-versus-one” (OVO) and “one-versus-all” (OVA) strategies (TOMAR and AGARWAL, 2015; WANG and ZHOU, 2017).
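To make the one-versus-one idea concrete, the following pure-Python sketch trains one binary classifier per pair of classes and labels a new point by majority vote. It is only an illustration under stated assumptions: the binary learner `train_pair` is a deliberately simple stand-in (mean Gaussian-kernel similarity to each class), not a twin SVM, and the function names, the kernel width `r`, and the toy data are all hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def gaussian_kernel(a, b, r=1.0):
    """Gaussian kernel exp(-||a - b||^2 / (2 r^2)) between two points."""
    sq_dist = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-sq_dist / (2.0 * r ** 2))


def train_pair(X, y, pos, neg, r):
    """Stand-in binary learner (NOT a twin SVM): mean kernel similarity per class."""
    pos_pts = [x for x, label in zip(X, y) if label == pos]
    neg_pts = [x for x, label in zip(X, y) if label == neg]

    def predict(x):
        s_pos = sum(gaussian_kernel(x, p, r) for p in pos_pts) / len(pos_pts)
        s_neg = sum(gaussian_kernel(x, q, r) for q in neg_pts) / len(neg_pts)
        return pos if s_pos >= s_neg else neg

    return predict


def train_ovo(X, y, r=1.0):
    """Train one binary classifier for every unordered pair of classes."""
    classes = sorted(set(y))
    return [train_pair(X, y, i, j, r) for i, j in combinations(classes, 2)]


def predict_ovo(classifiers, x):
    """Each pairwise classifier casts one vote; the most-voted class wins."""
    votes = Counter(clf(x) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: three classes of 2-D points (not the survey data).
X = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0), (10, 1)]
y = [1, 1, 2, 2, 3, 3]

classifiers = train_ovo(X, y, r=1.0)
n_classifiers = len(classifiers)  # one classifier per pair of the 3 classes
label_a = predict_ovo(classifiers, (0.2, 0.5))
label_b = predict_ovo(classifiers, (9.5, 0.5))
print(n_classifiers, label_a, label_b)
```

Swapping `train_pair` for any other binary learner leaves the voting logic unchanged, which is why the OVO strategy can wrap a classifier that was designed for two classes only.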
SVMs have been widely applied in areas that range from disease diagnosis and bankruptcy prediction to consumption behavior prediction (e.g., electricity, health products, and building energy) (BAHAMONDE et al., 2007; GUO, 2013; KAVAKLIOGLU, 2011). This study uses two representative machine learning methods, Lasso and OVO v-TSVM, to investigate the determinants of wine price selection under free choice and four purpose-based choices (gift, banquet, party, and self-drinking), so as to predict the price of wine purchased by a consumer and simultaneously estimate the effects of the major factors selected through the Lasso method.

2. MATERIALS AND METHODS

2.1. Conceptual framework

Numerous research disciplines, including economics, marketing, psychology, and product studies, share an interest in consumers’ behavior. Researchers have increasingly concentrated on consumers’ attitudes, motivations, perceptions, and preferences for wine. Previous studies show that the motivation for purchasing wine varies under different purchasing situations (BARREIRO et al., 2008). Moreover, GOODMAN (2009) found that previous tasting experience and the opinions of other people significantly influence wine purchasing behavior. Consumers’ knowledge of wine positively and notably affects their wine purchasing behavior (HUSSAIN et al., 2007). Consumers with higher product involvement are less sensitive to wine price, whereas consumers with lower product involvement focus more on price discounts (JAEGE et al., 2009). Furthermore, many studies have shown that purchase choices in wine consumption are closely related to age and education. Based on previous studies and the characteristics of wine consumption, the factors affecting wine consumption are summarized in Fig. 1.
It covers purchasing motivations, reference group factors, marketing factors, wine quality factors, the level of knowledge about wine, and the characteristics of consumers.

Figure 1. Conceptual framework of consumer’s purchasing behavior for wine.

2.2. Questionnaire

A questionnaire on Chinese consumers’ decision-making behavior towards wine (shown in the appendix) was designed; it consisted of 30 questions and included the following contents:

(1) Questions regarding the purchasing behaviors of consumers (the frequency of purchasing and drinking).

(2) Questions investigating the price of wine that consumers frequently purchase. The consumers selected from seven price categories: 1=“$0-7.5,” 2=“$7.6-15.1,” 3=“$15.2-22.6,” 4=“$22.7-30.1,” 5=“$30.2-45.2,” 6=“$45.3-75.3,” and 7=“$75.4 and above.” Based on the literature review, four common types of motivation for wine consumption (gift, banquet, party, and self-drinking) were extracted and described in the questionnaire. In addition, the consumers were asked to choose the price of wine that they purchase for each specific purpose.

(3) Multi-item scale questions that measure the factors influencing consumer purchasing, such as the influence of others, the quality of wine, enterprise marketing factors, and the knowledge of consumers. This study investigates 10 wine quality items, namely, the origin of wine, vintage, effects, packing, brand, label information, color, aroma, taste, and awards. The enterprise marketing factors contain four items, namely, advertisement, promotion, service and attitude of the salesperson, and store location and environment. The 16-item scale was collected using a 5-point Likert scale from 1=“Strongly disagree” to 5=“Strongly agree.”

(4) Consumers’ socio-demographic characteristics: gender, age, marital status, monthly income, education background, and occupation.
All six features are coded with the numbers “1, 2, 3, …” to denote the variable level from low to high.

2.3. Survey

Considering the sampling frame and the level of economic development in different regions, we hired and trained several undergraduate students from China Agricultural University to conduct the survey. We recognized that young people are the main force in wine consumption and that many wine tasting groups are found on the Internet. The survey was conducted in 2016 and lasted for five months. A total of 1600 questionnaires were distributed in many provinces of China, and 995 questionnaires were returned. In the returned questionnaires, the respondents were asked the question “In the past year, how often did you purchase wine?” The data were “cleaned” by removing responses of “Never bought wine.” Therefore, the respondents in this study are consumers who purchased wine on at least one occasion. Finally, 609 questionnaires were used for the final analysis.

2.4. Methods

The analysis of the data consisted of two steps. First, the Lasso method was used to select the determinants. In theory, the more features we use, the stronger the discrimination ability we can obtain. However, an excessive number of features may increase the learning time and lead to an “overfitting” problem. Accurate feature selection is therefore a prerequisite for high prediction accuracy. The Lasso method penalizes the regression coefficients with an L1 penalty, shrinking many of them to zero. Features with non-zero coefficients are “selected” by the Lasso method, which indicates that these features contribute most to the wine purchasing behavior of consumers.

Second, the OVO multi-class v-TSVM (Mv-TSVM) method was used to predict the behavior of Chinese wine consumers. To the best of our knowledge, the v-TSVM (PENG, 2010) was initially proposed for binary problems.
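As an illustration of the first step (Lasso feature selection), the sketch below minimizes a squared-error loss with an L1 penalty by cyclic coordinate descent and reports which coefficients remain non-zero. Several things here are assumptions rather than details given in the text: the 1/(2n) scaling of the loss, the coordinate-descent solver, the penalty value `lam`, and the toy data, in which the response depends strongly on the first feature and only weakly on the second.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become exactly 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=100):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = 0.0
            for i in range(n):
                pred_without_j = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
            z = sum(X[i][j] ** 2 for i in range(n))
            beta[j] = soft_threshold(rho / n, lam) / (z / n)
    return beta


# Hypothetical toy data (not the survey data): y = 2 * x0 + 0.1 * x1.
X = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
y = [2.1, 1.9, -1.9, -2.1]

beta = lasso_coordinate_descent(X, y, lam=0.5)
selected = [j for j, b in enumerate(beta) if b != 0.0]
print(beta, selected)
```

In this toy case the weak second feature is dropped because its coefficient is shrunk exactly to zero, while the strong first feature is kept with a reduced coefficient; a larger `lam` removes more features.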
For a K-class problem, we take the ith class as the positive class and the jth class as the negative class to construct a binary v-TSVM classifier. The OVO Mv-TSVM method therefore needs to construct K(K−1)/2 binary v-TSVM classifiers. For a new testing point, we obtain the votes for each class and assign the label with the maximum number of votes. For the nonlinear case, we used the Gaussian kernel function $\mathrm{Ker}(x_i, x_j) = e^{-\|x_i - x_j\|^2 / (2r^2)}$ and a grid search to find the optimal parameter. All algorithms were written and run in MATLAB 2014a, and all statistical analyses were conducted using SPSS version 20 and Microsoft Office Excel version 2013.

3. RESULTS

The overall Cronbach’s α of the questionnaire is 0.776 (F=334.221, Sig=0.00), indicating that the survey has high internal consistency. The response rate of the questionnaire is 62.19%. A majority of the respondents (63.71%) would purchase wine once or twice a year, and 81.94% would drink two or more bottles of wine in a year. The 609 samples were collected from 21 provinces, cities, and autonomous regions in China. We obtained the per capita monthly income of these areas from the China Statistical Yearbook 2016 and used it to calculate an overall per capita monthly income as a standard, which is $780.06. The provinces where the samples were collected are located in Eastern China, and most of the samples come from relatively economically advanced areas. A total of 9.69% of participants would purchase wine as a gift, 21.18% for banquets, 30.05% for parties, and 39.08% for self-drinking.

The wine prices that the consumers purchased are listed in Table 1. Under free choice, 65.51% would purchase wine in the price range of $7.6-30.1. The average price is $30.20 (SD=0.83), with a 95% confidence interval of (28.58, 31.85). For gifts, 55.83% would select wine priced at $30.2 or above, and the average price is $40.64 (SD=0.91).
For banquets, 64.20% would select wine in the price range of $15.2-45.2, and the average price is $30.71 (SD=0.74). For parties, 64.86% would select wine in the price range of $7.6-30.1, and the average price is $27.97 (SD=0.69). For self-drinking, 65.19% would select wine in the price range of $7.6-30.1, and the average price is $28.11 (SD=0.75).

Table 1. Statistical results of consumers’ purchased wine price.

| Wine price ($) | Free-choice (%) | Gift-based (%) | Banquet-based (%) | Party-based (%) | Self-drinking-based (%) |
|---|---|---|---|---|---|
| 0-7.5 | 2.63 | 1.15 | 1.64 | 3.28 | 4.43 |
| 7.6-15.1 | 22.99 | 12.15 | 17.24 | 20.69 | 22.33 |
| 15.2-22.6 | 20.85 | 12.15 | 21.02 | 21.35 | 22.50 |
| 22.7-30.1 | 21.67 | 18.72 | 23.15 | 22.82 | 20.36 |
| 30.2-45.2 | 12.32 | 19.70 | 20.03 | 19.05 | 14.45 |
| 45.3-75.3 | 9.85 | 17.41 | 10.84 | 8.87 | 10.84 |
| Above 75.3 | 9.69 | 18.72 | 6.08 | 3.94 | 5.09 |
| Mean* | 30.20 | 40.64 | 30.71 | 27.97 | 28.11 |
| SD* | 0.83 | 0.91 | 0.74 | 0.69 | 0.75 |
| 95% confidence interval* | (28.58, 31.85) | (38.89, 21.82) | (29.27, 32.17) | (26.65, 29.35) | (26.61, 29.59) |

Note: *Results of 10000 Bootstrap resamples.

The demographic characteristics of the sample are detailed in Table 2. The average age is 35.18 years (SD=0.42). The average monthly income is $774.67 in the 10000-resample Bootstrap estimation, which is nearly the same as the standard value of $780.06. The respondents are 52.71% male and 47.29% female; 32.35% are single, and 67.65% are married. A majority of the respondents (76.52%) have attained a college degree, 18.56% attended senior high or special school, and only 4.93% attended primary or junior high school. The respondents vary in occupation: 8.21% are students, 2.30% are peasants, 25.94% are freelancers, 2.96% are unemployed or retired, 11.99% are staff of state-owned companies, 13.30% are staff of foreign or private enterprises, 15.60% work as party and government officers, 9.36% work in education and scientific research units, and 10.34% work in other fields. Inspired by FORLEO et al.
(2017), we list the associations of wine consumption prices with demographics in Table 3. Monthly income and occupation are significant regardless of the purchasing purpose. High-income consumers who choose high-priced wine account for about 10%, and low-income consumers who do so account for 3-4%. Male and female consumers differ in wine purchasing for free-choice, gift-giving, and banquet-based purposes. Male consumers who choose wine priced above $30.2 account for 17.73%, and female consumers who do so account for 14.12%. The gender difference is not obvious in party-based and self-drinking-based wine purchasing. Older consumers (above 46 years) who choose high-priced wine (above $30.2) account for only 8%, and the percentage increases to 12% for gift purposes. Single and married consumers differ in wine price choice for party-based and self-drinking-based purposes. Statistically significant associations between education and wine price choice were identified for gift and banquet-based purposes. Highly educated consumers who choose high-priced wine account for about 30%, whereas consumers with a primary or junior high school background who do so account for only 1%.

Fig. 2 illustrates the statistics of the influencing factors, where the mean of wine knowledge is the highest at 4.04, and the mean of advertisement is the lowest at 3.18.

Table 2. Statistical features of respondents.
| Demographic characteristics | Category | Percentage | Sample population (n) |
|---|---|---|---|
| Gender | Male | 52.71 | 321 |
| | Female | 47.29 | 288 |
| Age | 18-25 | 22.33 | 136 |
| | 26-35 | 31.86 | 194 |
| | 36-45 | 24.14 | 147 |
| | 46-55 | 17.24 | 105 |
| | Above 55 | 4.43 | 27 |
| | Mean / SD* | 35.18 / 0.42 | |
| | 95% confidence interval* | (34.35, 36.01) | |
| Marital status | Single | 32.35 | 197 |
| | Married | 67.65 | 412 |
| Per capita monthly income ($) | 0-301.2 | 13.46 | 82 |
| | 301.3-451.8 | 14.29 | 87 |
| | 451.9-753.0 | 32.35 | 197 |
| | 753.1-1054.2 | 20.69 | 126 |
| | 1054.3-1506.0 | 9.52 | 58 |
| | 1506.1-2259.0 | 4.76 | 29 |
| | Above 2259.0 | 4.93 | 30 |
| | Mean / SD* | 774.67 / 21.81 | |
| | 95% confidence interval* | (732.09, 818.99) | |
| Educational background | Primary or Junior high school | 4.93 | 30 |
| | Senior high or Special school | 18.56 | 113 |
| | Junior college or Undergraduate | 62.73 | 382 |
| | Postgraduate and above | 13.79 | 84 |
| Job | Students | 8.21 | 50 |
| | Peasantry | 2.30 | 14 |
| | Freelance | 25.94 | 158 |
| | Unemployed/retired | 2.96 | 18 |
| | Staffs of state-owned companies | 11.99 | 73 |
| | Staffs of foreign or private enterprises | 13.30 | 81 |
| | Party and government officers | 15.60 | 95 |
| | Education and scientific research units | 9.36 | 57 |
| | Else | 10.34 | 63 |

Note: *The Bootstrap estimate was calculated as the mid-value of the range.

Table 3. Association of wine consumption prices with demographics.

| Items-prices | Gender | Age | Marital status | Monthly income | Education | Occupation |
|---|---|---|---|---|---|---|
| Free-choice | 0.004* | 0.111 | 0.097 | 0.000*** | 0.095 | 0.001*** |
| Gift-based | 0.014* | 0.021* | 0.369 | 0.000*** | 0.002** | 0.000*** |
| Banquet-based | 0.017* | 0.000*** | 0.215 | 0.000*** | 0.010** | 0.000*** |
| Party-based | 0.160 | 0.000*** | 0.012* | 0.000*** | 0.136 | 0.000*** |
| Self-drinking based | 0.230 | 0.001*** | 0.024* | 0.000*** | 0.110 | 0.000*** |

Note: *0.01