SAJEMS NS Vol 5 (2002) No 1 233

The Use of Neural Networks and Rule Induction for Customer Segmentation and Target Market Profiling

J Z Bloom
Department of Business Management, University of Stellenbosch

ABSTRACT

Inadequate market segmentation and clustering problems could cause an enterprise to either miss a strategic marketing opportunity or not cash in on a tactical campaign. The need for in-depth knowledge of customer segments, and the need to overcome the limitations of non-linear problems, require a different approach. The objectives of the research are (1) to consider the use of self-organising feature map (SOM) neural networks for segmenting tourist markets and (2) to assess the use of inducing decision trees to obtain rules for profiling existing and classifying new respondents. The findings of the SOM neural network modelling indicate three definitive natural clusters. The induction of rules from decision trees was used to obtain a broad indication of a segment profile on the basis of a rule set, and also enables the segment classification of customers from follow-up surveys.

JEL M30

1 INTRODUCTION AND BACKGROUND

Marketing strategists often encounter the problem of how to segment and compile profiles of an enterprise's existing customers. Market segmentation is a process of dividing a market into distinct groups of buyers who might require separate products or marketing mixes (Venugopal & Baets, 1994: 36). Segmentation is based on various consumer characteristics such as demographics, socio-economic factors, geographic location, and product-related behavioural characteristics like purchasing and consumption behaviour and attitudes towards and preference for products and services (Dibb & Simkin, 1991: 5). Target marketing is a strategy that aims at grouping a major market into segments in order to target one or more of these segments or to develop products and marketing programmes tailored to each segment (Kotler, 2000: 256-58).
Reproduced by Sabinet Gateway under licence granted by the Publisher (dated 2009).

Customer clustering and segmentation are two of the most important data mining methods used in marketing and customer relationship management (Saarenvirta, 1998: 1). Behavioural clustering and segmentation help drive strategic marketing initiatives, while sub-segments based on demographic and lifestyle characteristics could also be determined and used for tactical marketing efforts. However, inadequate market segmentation and clustering, together with a limited understanding of the characteristics of a segment profile, could cause an enterprise to either miss a strategic marketing opportunity or not cash in on a tactical campaign.

Market segmentation has not only developed as a tool to segment markets and identify target markets, but could also be used at a higher level to obtain more in-depth knowledge of the segment characteristics and further assist an enterprise to understand the relationship with its customers.

The need for in-depth knowledge of customer segments and to overcome the limitations of non-linear problems requires a different approach. For instance, neural network models based on artificial intelligence technologies can be developed to create clusters based on combinations of natural characteristics present in a set of customer data (e.g. purchase history, demographic attributes, psychographic characteristics, etc.). However, many of the neural network modeling applications are used to develop individual models for a specific research problem. The sequencing of modeling applications, such as using the output of a neural network modeling application as an input or output for another artificial intelligence technique (e.g. the induction of decision trees to obtain rules), also enhances the knowledge base of customers' behavioural characteristics.
To illustrate the applications of a self-organising feature map (SOM) neural network for segmentation and the induction of decision trees to obtain rules, the data of a survey of domestic tourists to the Western Cape is used. The use of SOM neural networks and induction of decision trees in travel and tourism is not widespread. Applications of neural networks in the tourism industry refer to, among others, customer analysis and holiday package targeting (Ryman-Tubb, 1993) and forecasting tourist behaviour (Pattie & Snyder, 1996). There is an increased need, however, for tools and techniques which could provide further knowledge and understanding of dynamic tourist behaviour. It has become essential for national and provincial tourism organisations to extract knowledge from the data obtained from tourist surveys. The findings of tourist surveys tend to be mostly descriptive and provide little or no indication of behavioural segments, of the underlying profile of the tourist characteristics that form part of a segment, or of future segment classification.

In the light of the above, the primary objectives of the research are (1) to consider the use of SOM neural networks for segmenting tourist markets and (2) to assess the use of inducing decision trees to obtain rules for profiling existing and classifying new respondents by using the output provided by SOM neural networks.

The article firstly considers the nature and scope of a SOM neural network for learning and grouping data and the induction of decision trees to obtain rules. A conceptual comparison is provided of Cluster Analysis and SOM neural networks, and the advantages and disadvantages of decision trees are also discussed.
A tourism industry application of SOM neural networks and the induction of decision trees to obtain rules is provided by means of a discussion of the methodology related to the modeling process. The outcomes of the modeling process, in terms of the ability of a SOM neural network to naturally group data and the use of induction to obtain rules from decision trees, are also discussed. A conclusion is provided in the final section.

2 NATURE AND SCOPE OF SOM NEURAL NETWORKS AND INDUCING RULES FROM DECISION TREES

Artificial neural networks (ANN) have evolved as a technique to solve a variety of problems in the business field, from classification, grouping and forecasting to portfolio optimisation, credit scoring and stock picking. Neural networks are described as information processing technology, which is inspired by the human brain and mimics its problem solving processes (Klimasauskas, 1996: 45). ANNs exhibit certain features such as the ability to learn complex patterns in a set of data and generalise the learned pattern (Venugopal & Baets, 1994: 30).

There are a variety of neural network algorithms for solving complex business and marketing problems. The appropriate use of a learning algorithm depends primarily on the type of problem which needs to be modelled. A taxonomy of learning algorithms has been proposed in the literature (Lippman, 1989: 10). The taxonomy distinguishes algorithms primarily by input format, i.e. binary-valued input or continuous-valued input. Each of these categories could be further subdivided into supervised learning and unsupervised learning techniques.

Unsupervised learning algorithms use patterns that are typically redundant raw data, having no labels regarding their class membership or association. During this mode of learning, the network must discover for itself any possible existing patterns, regularities and separating properties.
The parameters of the network undergo changes when discovering the properties mentioned previously and self-organising occurs (Turban & Trippi, 1996: 16). During unsupervised learning only input stimuli are presented to the network. Examples of this type of learning are adaptive resonance theory and Kohonen self-organising feature maps (Kohonen, 1988).

Once an acceptable mapping solution is obtained, it is possible to use the output obtained from the SOM neural network to develop an induction model for obtaining rules from decision trees. Decision trees are a way to represent mathematical regularities or relationships underlying a set of observations obtained for a particular problem. In addition, decision trees are hierarchical structures that partition the set of observations to explicitly relate a number of independent variables, known as attributes, to one or more discrete dependent variables, or classes (Gray, 1990: 41-42).

A decision tree consists of nodes and branches. Each node is either a decision node, which consists of a test on an attribute that partitions the current subset of observations into two or more smaller subsets, or a terminal node, which classifies the remaining subset of observations in that node with a particular class label. Branches indicate the path that must be followed as decisions are made at each decision node until a terminal node is reached (Ben-David & Mandel, 1995: 110).

Decision trees are compact and are therefore readily understood. However, complex classification problems may cause the decision tree to become unwieldy, with a large number of nodes and branches. Although such a tree may be complete and accurate, it is often difficult to understand. It is possible to simplify the tree and make it more intelligible by expressing the tree model in terms of so-called If ...
Then rules, also referred to as production rules (Crusader Systems, 1998b: 115). Production rules have been widely used to represent knowledge in expert systems and they have the advantage of being easily interpreted by human experts because of their modularity, that is, a single rule can be understood in isolation and does not require a reference to other rules (Kamber, Winstone, Gong, Cheng & Han, 1997: 10).

3 NEURAL NETWORKS, INDUCTION OF DECISION TREES AND MULTIVARIATE STATISTICAL TECHNIQUES

SOM neural networks provide an opportunity to extract knowledge from data and offer improved performance by overcoming various limitations associated with multivariate statistical techniques such as Cluster Analysis. Although SOM neural networks also have limitations in respect of explanation, they offer advantages in terms of learning ability, flexibility, adaptation and knowledge discovery (Goonatilake, 1995: 21). The nearest neighbour algorithm, which is the premise for a SOM neural network, is a refinement of existing cluster techniques, in the sense that both use distance in some feature space to create structure in the data. The nearest neighbour algorithm offers more refinement, as part of the algorithm provides a way of automatically determining the weighting of the importance of predictors and how distance will be measured within the feature space (Berson, Smith and Thearling, 2000: 144-45).

In the context of this research a comparison is provided of SOM neural networks and Cluster Analysis, while the advantages and disadvantages of inducing rules from decision trees are also highlighted. The primary advantages of SOM neural networks over Cluster Analysis include the following: SOM neural networks are more robust than cluster techniques.
The use of Cluster Analysis will provide a cluster solution even if no natural clusters exist in the data (Mitchell, 1994: 8). Various assumptions about the underlying distribution of the data are required to use Cluster Analysis, while SOM neural networks do not require any assumptions. The number of clusters requires specification when using Cluster Analysis, while SOM neural networks cluster data naturally by assigning an incoming signal to the segment having the nearest weight vector (Venugopal & Baets, 1994: 36-37).

The relevance of SOM neural networks for modeling tourism data stems from the need to provide a segmentation solution that will enable decision-makers to allocate scarce financial resources for more focused target marketing. Besides overcoming the limitations of Cluster Analysis, SOM neural networks enable more refined analysis of tourist behaviour and also provide a level of predictive ability to track changes in tourist profiles and behaviour.

The goal of classification trees is to predict or explain responses on a categorical dependent or class variable, and on this basis they have much in common with multivariate techniques like Discriminant Analysis. For instance, the hierarchical nature of classification trees is illustrated by a comparison with the decision-making procedure employed in Discriminant Analysis. Superficially, the Discriminant Analysis and classification tree decision processes might appear similar, because both involve coefficients and decision equations. However, it should be noted that the decisions of Discriminant Analysis are simultaneous, whereas those of classification trees are hierarchical (Statsoft, 2001).

For the purposes of this article, the advantages and disadvantages of tree-based methods are discussed as a means to enhance the classification and profiling of
customer segments.

The advantages of decision trees and rule induction in mining data for classification, profiling and decision-making include:

- Decision trees and rules explicitly represent the relationships discovered by the decision tree or rule induction algorithm and can be effectively understood and analysed by decision makers.
- The decision path followed by the tree or rule set can be easily followed when determining the class of a new observation, leaving nothing unknown or implied.
- If ... Then rules can be imported into the rule base of decision support systems and can be integrated with a knowledge database.
- The influence and relevance of attributes with regard to classification of the set of training data is shown.
- The induction of decision trees is very flexible as it easily copes with discrete and continuous attributes. Regression and neural network classification require the encoding of discrete attributes as either a number on a continuous scale or as orthogonal unit vectors (Crusader Systems, 1998b: 119-20).

The disadvantages of decision trees are the following:

- The algorithms to induce trees are dependent on the quality and appropriateness of the attributes of the training data. If the attributes chosen do not adequately represent the underlying relationships in the data, then the induction algorithm will group most of the data into one class with few nodes.
- The divide-and-conquer strategy partitions training data into smaller and smaller subsets. The algorithm uses less and less information about the entire training data as tree growth continues. The foundation for decisions based on the attributes lower down in the tree is diluted due to a smaller basis of information.
- The induced rules may require further processing and analysis after the initial induction cycle to improve comprehension (Crusader Systems, 1998b: 120-121).
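The relationship between a decision tree and its If ... Then rules can be made concrete with a minimal sketch. It flattens every root-to-leaf path of a small tree into one production rule and then classifies a new observation by finding the first rule whose conditions all hold. The tree, the attribute names (`duration_of_stay`, `purpose`), the threshold and the segment labels are hypothetical illustrations and are not the rules induced in this study.

```python
import operator

# Comparison operators allowed in a rule condition.
OPS = {"<=": operator.le, ">": operator.gt, "==": operator.eq}

# Hypothetical toy tree: attributes, threshold and labels are invented
# for illustration only and do NOT come from the study's induced tree.
TREE = {
    "attribute": "duration_of_stay",
    "branches": [
        (("<=", 14), {
            "attribute": "purpose",
            "branches": [
                (("==", "business"), {"leaf": "Segment 2"}),
                (("==", "holiday"), {"leaf": "Segment 3"}),
            ],
        }),
        ((">", 14), {"leaf": "Segment 1"}),
    ],
}


def tree_to_rules(node, path=()):
    """Walk every root-to-leaf path and emit one (conditions, label) rule."""
    if "leaf" in node:
        return [(path, node["leaf"])]
    rules = []
    for (op, value), child in node["branches"]:
        condition = (node["attribute"], op, value)
        rules.extend(tree_to_rules(child, path + (condition,)))
    return rules


def format_rule(conditions, label):
    """Render a rule in readable If ... Then form."""
    clauses = " AND ".join(f"{a} {op} {v!r}" for a, op, v in conditions)
    return f"IF {clauses} THEN segment = {label!r}"


def classify(rules, record):
    """Return the label of the first rule whose conditions all hold."""
    for conditions, label in rules:
        if all(OPS[op](record[a], v) for a, op, v in conditions):
            return label
    return None  # no rule fires for this record


RULES = tree_to_rules(TREE)
```

Each rule produced this way can be read in isolation, which is the modularity property attributed to production rules earlier in the text. A record such as `{"duration_of_stay": 20, "purpose": "holiday"}` is assigned to the hypothetical "Segment 1" without reference to any other rule.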
The limitations of Cluster Analysis provide the rationale for the use of artificial SOM neural networks. Decision tree models are inexpensive to construct, easy to interpret and easy to integrate with database systems; the benefits of using them for classifying and profiling customer segments therefore outweigh the limitations of inducing decision trees.

3 METHOD

On the basis of the research problem specification and the objectives of the research, the use of an unsupervised learning algorithm to group the data into segments and a decision tree algorithm for the profiling of tourist segments was required. Decision trees and rules are also generally used for classification purposes. A SOM neural network is used for the initial grouping of tourists, while the univariate-split decision tree algorithm is used for profiling the segments.

The research design is divided into two stages. The first stage involves the development of a SOM neural network and the second stage involves using the segment classification obtained from the SOM neural network model as the output variable to create a decision tree from which rules are induced for segment profiling. The modeling process uses two different learning algorithms to group the respondents and allow the induction of rules from a decision tree. The rules obtained from the decision tree induction could also be useful for the classification of new tourists from data obtained in follow-up surveys.

3.1 Sampling procedure, data capture and scope of data

The sample was geographically stratified in line with the distribution of the urban population nationally, and starting points were selected using a geo-demographic sampling grid. Persons who reside in urban areas within South Africa and had visited the Western Cape at least once during 1997 and 1998 qualified for inclusion in the survey.
A record was kept of unsuccessful contacts in order to be able to calculate the proportion of the urban population that visit the province. A total of 5 642 contacts were required to achieve the final sample of 1 630 respondents. Personal in-home interviews were conducted. The questions represented a broad mix of trip, demographic, socio-economic and geographic characteristics. The nature of the data was predominantly categorical and nominal.

The data was captured using a software package, Survey System (Creative Research Systems, 2001). The Survey System is tailored for survey research conducted using questionnaires. The software handles all phases of survey projects, from creating questionnaires through data entry and interviewing to producing tables, graphics and text reports.

The data used for the research presented in this paper is applicable to the domestic tourist market in South Africa. It is assumed that the behaviour of domestic tourists did not change significantly between the period of the survey and the time the analysis was conducted. However, it may be relevant for Western Cape Tourism to conduct a second survey in the near future to ascertain whether or not the profile and behaviour of tourists are changing.

3.2 Nature and design of a self-organising neural network model and the creation and induction of decision trees

The design of a SOM neural network is divided into several distinct steps. Several authors, including Deboeck (1995), Masters (1993), Blum (1992) and McCord-Nelson & Illingworth (1993), have outlined a series of steps for building a neural network model. The eight-step procedure proposed by Kaastra and Boyd (1996: 219), which encompasses many of the steps proposed by the abovementioned authors, is adapted for SOM neural networks and presented in Table 1.
The steps for the creation of decision trees and the induction of rules are also presented in Table 1, as adapted from Brodley & Utgoff (1995: 49).

Table 1 A sequence of steps used to design a SOM neural network model and create a decision tree for the induction of rules

| Steps for SOM neural network specification | Steps for specification of decision trees and rule induction |
|---|---|
| Step 1: Data collection | Step 1: Data collection |
| Step 2: Variable selection (number of inputs) | Step 2: Attribute selection (number of inputs, and output) |
| Step 3: Data pre-processing (e.g. normalising, log transformation, standardisation, scaling) | Step 3: Data pre-processing (e.g. removal of outliers and transformations) |
| Step 4: Selection of training, validation and test sets | Step 4: Selection of training, validation and test sets |
| Step 5: Specification of SOM neural network training parameters and configuration values | Step 5: Specification of decision tree training parameters and configuration values |
| - Number of input neurons | - Number of input neurons |
| - Percentage iterations for constant initial learning rate | - Stopping rules (uniform class, uniform vector, minimum observations) |
| - Learning rate increment | - Splitting criteria (gain, gain ratio or gain ratio/log(Depth)) |
| - Neighbourhood radius | - Pruning algorithm (error based pruning, reduced error pruning) |
| - Radius increment | |
Table 1 continued

| Steps for SOM neural network specification | Steps for specification of decision trees and rule induction |
|---|---|
| Step 6: SOM neural network training specifications | Step 6: Rule induction specifications |
| - Initial weights (upper and lower bound) | - Rule generalisation algorithm (upper confidence limit, correctness generalising and best difference) |
| - Presentation of records | - Rule set optimising algorithm (Minimum Description Length Principle) |
| - Number of iterations (epochs) | |
| Step 7: Evaluation criteria | Step 7: Evaluation criteria for both the decision tree and rule induction (before and after pruning on the training data and validation set) |
| Step 8: Model deployment | Step 8: Model deployment |

Source: Adapted from Kaastra and Boyd (1996: 219) and Brodley and Utgoff (1995: 49)

The development of any neural network or rule induction model is based on a thorough knowledge of the research problem. In addition, the procedure described in Table 1 is not a one-off exercise, but may involve revisiting steps between the training and validation of the model to reassess the input variables and network parameters.

All the variables included in the survey that had no missing data were considered for the modeling process. An exploratory data analysis (EDA) was conducted to provide an indication of the distribution of the variables. Both Box and Whisker plots and histograms were used to visualise the distribution of the data variables. These graphical techniques were used together with several descriptive statistics (i.e. mean, median, standard deviation, skewness and kurtosis) to further describe the data. The EDA is important to ensure that data variables with substantial positive or negative skewness are not included, as the natural clustering algorithm of the SOM neural network would tend to consider such data as one group, which would make distinguishing between groups more difficult.
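The descriptive statistics named above can be computed with a short sketch such as the following. It assumes the moment-based definitions of skewness and excess kurtosis; the exact formulas used by the statistical software in the study are not stated, and the screening limit of 1.0 in `is_heavily_skewed` is an illustrative assumption rather than a value taken from the article.

```python
import statistics


def describe(values):
    """Return mean, median, standard deviation, skewness and kurtosis.

    Skewness and (excess) kurtosis use population central moments; this
    is an assumed convention. The data must not be constant.
    """
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    m4 = sum((x - mean) ** 4 for x in values) / n  # fourth central moment
    return {
        "mean": mean,
        "median": statistics.median(values),
        "std_dev": statistics.stdev(values),       # sample standard deviation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2 - 3.0,            # 0 for a normal distribution
    }


def is_heavily_skewed(values, limit=1.0):
    """Flag a variable for exclusion; the limit is a hypothetical choice."""
    return abs(describe(values)["skewness"]) > limit
```

A symmetric variable such as `[1, 2, 3, 4, 5]` has a skewness of zero, whereas a variable dominated by a few large values, such as `[1, 1, 1, 2, 10]`, has a clearly positive skewness and would be a candidate for exclusion or transformation under the screening logic described above.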
Based on the outcomes of the exploratory data analysis and the importance of selecting the right mix of variables for creating a SOM neural network model, 10 variables, which include trip, demographic, socio-economic and geographical characteristics, are included for the SOM neural network modeling process. The data captured in Survey System was exported in a delimited format into Excel, which also facilitated the importation of the data into the modeling software. In order to enhance the description of the segment profile, six additional variables were included for the creation of the decision tree and the induction of the rules. Table 2 lists the 10 variables used for the SOM neural network modeling and the 16 variables for the induction of rules from a decision tree.

Table 2 Variables used for the construction of the decision tree and rule induction

| Variables used for SOM modeling procedure | Variables used for creation and induction of decision trees |
|---|---|
| Main purpose of visit | Main purpose of visit |
| Region visited in the Western Cape province | Region visited in the Western Cape province |
| Duration of stay | Duration of stay |
| Total spent on first trip | Total spent on first trip |
| Likelihood to visit the Western Cape again | Likelihood to visit the Western Cape again |
| Ethnic group | Ethnic group |
| Occupation | Occupation |
| Origin of tourist | Origin of tourist |
| Income group | Income group |
| Age | Age |
| Education | Education |
| | Number of visits to the Western Cape in past 2 years |
| | Time of the year of visit to the Western Cape |
| | Information sources of the Western Cape |
| | Arrangements for visit |
| | Likelihood to visit the Western Cape province again |
| | Gender |

The software package used for part of the exploratory data analysis and the modeling procedure is Basic Modelgen (Crusader Systems, 1998a).
In addition, the software package Statistica is used for most of the exploratory data analysis (Statsoft, 1998).

3.2.1 Self-organising neural networks

The following discussion of the SOM neural network modeling process refers to the steps listed in Table 1. Data variables identified for modeling were scaled to assume values between 0 and 1, or recoded to the binary values 0 or 1, respectively. In this manner each data record is considered by the network as continuous-valued input or binary-valued input.

In order to create the training, validation and test sets, the data set (1 630 records) was randomly subdivided so that 70 per cent of the data points were allocated to the training set, 10 per cent to the validation set and 20 per cent to the test set. This classification is based on heuristics and on the principle that the size of the validation set must strike a balance between obtaining a sufficient sample size to evaluate both the training and test sets (Kaastra & Boyd, 1996: 223).

The training parameters used for the final SOM neural network modeling process, together with their values, are as follows.

| Parameter | Value |
|---|---|
| Learning decrement (per cent) | 0.1 |
| Initial weights (per cent) | -0.5 (lower bound); 0.5 (upper bound) |
| Scaling | 0.1 (lower bound); 0.9 (upper bound) |
| Neighbourhood radius | 5 |
| Radius increment | Linear |
| Side (N of one map side) | 40, which refers to 1600 data point images on a two-dimensional grid |

| Configured values | Value |
|---|---|
| Presentation of data | Random |
| Training cases | 70 per cent |
| Test cases | 20 per cent |
| Validation cases | 10 per cent |
| Number of inputs | 10 neurons |

The SOM neural network was trained for 1000 iterations and the records were presented to the network in a random manner. The findings of the SOM neural network modeling procedure are presented in a following section.
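To make the role of these parameters more tangible, the following is a minimal sketch of a textbook Kohonen SOM training loop. It is not the algorithm implemented in the modeling software used in the study, whose internals are not described here. The scaling bounds (0.1 to 0.9), initial weight bounds (-0.5 to 0.5), starting radius (5), linear radius decrease, map side (40), iteration count (1000) and random presentation follow the values listed above. The initial learning rate `rate0`, its linear decay, the Gaussian neighbourhood function and the reading of one iteration as one record presentation are assumptions.

```python
import math
import random


def scale(column, lo=0.1, hi=0.9):
    """Min-max scale one variable into [lo, hi]."""
    cmin, cmax = min(column), max(column)
    span = (cmax - cmin) or 1.0          # avoid division by zero
    return [lo + (hi - lo) * (x - cmin) / span for x in column]


def best_matching_unit(weights, record):
    """Return (row, col) of the unit whose weight vector is nearest."""
    best, best_dist = None, math.inf
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, record))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best


def train_som(records, side=40, iterations=1000, radius0=5.0,
              rate0=0.5, seed=0):
    """Train a side x side map; rate0 and its decay are assumed values."""
    rng = random.Random(seed)
    dim = len(records[0])
    # Initial weights drawn uniformly between the lower and upper bound.
    weights = [[[rng.uniform(-0.5, 0.5) for _ in range(dim)]
                for _ in range(side)] for _ in range(side)]
    for t in range(iterations):
        frac = 1.0 - t / iterations       # linear decay from 1 towards 0
        radius = max(radius0 * frac, 0.5)
        rate = rate0 * frac
        record = rng.choice(records)      # random presentation of records
        br, bc = best_matching_unit(weights, record)
        for r in range(side):
            for c in range(side):
                d2 = (r - br) ** 2 + (c - bc) ** 2
                if d2 <= radius ** 2:     # unit lies in the neighbourhood
                    influence = math.exp(-d2 / (2 * radius ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        w[i] += rate * influence * (record[i] - w[i])
    return weights
```

After training, calling `best_matching_unit(weights, record)` for each respondent gives that record's image on the map. Records that land in the same region of the grid are candidates for the same segment, which is how the natural grouping described in the text arises. In pure Python a 40 x 40 map is slow to train, so a smaller `side` is advisable when experimenting.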
3.2.2 Induction of rules from decision trees

The second stage of the analysis entailed the induction of rules from a decision tree. The decision tree was created using 16 variables, including the 10 variables that were also used for the SOM modeling process. In order to specify an output variable, each respondent belonging to a specific segment was mapped back to the original data set. The model specification for the decision tree included 16 input neurons and a single output neuron representing the segment classification of each respondent.

The primary aim of this stage of the research is to develop a trained rule induction model which could be used to provide an indication of the profile of each segment and also assist with the classification of new respondents based on the existing rule set for each segment. The parameters and configured values used to create the decision tree and induce rules are as follows:

| Decision tree and rule induction specifications | Value |
|---|---|
| Stopping rules | Uniform class; Minimum observations (2) |
| Splitting criteria | Gain ratio |
| Pruning algorithm | None specified |
| Rule generalisation algorithm | Correctness generalising |

| Model specification | Value |
|---|---|
| Training cases | 90 per cent |
| Test cases | 10 per cent |
| Number of inputs | 16 neurons |
| Number of outputs | 1 neuron |

Many algorithms have been proposed in the literature to induce a decision tree and thereafter use the tree structure as a basis for deriving a set of production rules. It is beyond the scope of this text to describe each of these algorithms, especially those that induce rules from scratch. For the purposes of this article, the well-known C4.5 algorithm is used for the induction of the decision tree (Quinlan, 1993). The C4.5 algorithm generates a classification-decision tree for the given data set by recursive partitioning of data.
The algorithm considers all the possible tests that can split the data set and selects a test that gives the best information gain. For each discrete attribute, one test with as many outcomes as the number of distinct values of the attribute is considered. In addition, for each continuous attribute, binary tests involving every distinct value of the attribute are considered. In order to gather the entropy gain of all these binary tests efficiently, the training data set belonging to the node under consideration is sorted on the values of the continuous attribute and the entropy gains of the binary cut based on each distinct value are calculated in one scan of the sorted data. This process is repeated for each continuous attribute (Kamber, et al., 1997: 4).

For the purposes of this research, the uniform class stopping rule is used together with the minimum observations stopping rule. The uniform class stopping rule is generally used in classification problems and will fire if all the cases at a node are of the same class. The minimum observations stopping rule will fire when the number of cases in a node is less than or equal to the specified number, which is two in this case. Two stopping rules are used to prevent a large, overtrained tree.

The gain ratio is used as the splitting criterion in the C4.5 algorithm and is considered a measure of impurity (Quinlan, 1993; Breiman, 1996). It is also a commonly used criterion and generally provides good results. This measure determines the ratio of the information contained within the new, smaller subsets of data after a given split has been performed to that contained in the previous, larger subset of data. The attribute and its associated value on which to split are chosen as the pair that maximises the gain ratio (Quinlan, 1993).
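The splitting criterion and the two stopping rules described above can be sketched for discrete attributes as follows. The sketch uses the standard definition from Quinlan (1993), in which the gain ratio is the information gain of a split divided by the split information, and the attribute with the largest gain ratio is selected. The function names and the toy records are hypothetical; the sketch omits continuous attributes, pruning and rule generalisation, and is not the implementation used by the modeling software.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())


def gain_ratio(rows, labels, attribute):
    """Information gain of splitting on `attribute`, divided by split info."""
    n = len(rows)
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    # Weighted entropy of the subsets produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    split_info = -sum(len(g) / n * math.log2(len(g) / n)
                      for g in groups.values())
    return gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, labels, attributes):
    """Choose the attribute with the largest gain ratio."""
    return max(attributes, key=lambda a: gain_ratio(rows, labels, a))


def should_stop(labels, min_observations=2):
    """Uniform class rule, or node size <= the minimum observations."""
    return len(set(labels)) <= 1 or len(labels) <= min_observations


# Hypothetical toy records; not data from the survey.
ROWS = [
    {"purpose": "holiday", "origin": "Gauteng"},
    {"purpose": "holiday", "origin": "Eastern Cape"},
    {"purpose": "business", "origin": "Gauteng"},
    {"purpose": "business", "origin": "Eastern Cape"},
]
LABELS = ["Segment 1", "Segment 1", "Segment 2", "Segment 2"]
```

In this toy data, `purpose` separates the two classes perfectly while `origin` carries no information about the class, so `best_attribute(ROWS, LABELS, ["purpose", "origin"])` selects `purpose`.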
The rule generalisation algorithm used in this study simplifies the predicate of a rule. The correctness generalising algorithm keeps the optimal predicates to maximise the number of correct cases identified by a rule (Crusader Systems, 1998b: 71). The C4.5 algorithm uses a criterion that pessimistically estimates the expected classification error rate of a generalised rule to decide which part of the antecedent to remove. The change in the rule that produces the lowest error rate is selected (Quinlan, 1993). The outcome of the induction of the decision tree is discussed in a following section.

4 FINDINGS OF THE MODELING PROCEDURE

4.1 Findings of the SOM neural network model

It was possible to distinguish three clear segments among domestic tourists that visit the Western Cape. Figure 1 is an illustration of the self-organising feature map, which is obtained through the unsupervised SOM neural network modeling process. Each record in the training set corresponds to a single unit, namely the best matching one. Thus, the unit represents the record's image on the feature array (consisting of 40x40 units), which implies 1600 output units in this case. The self-organising feature map is two-dimensional and depicts the natural segments obtained from the interrelationships between the 10 variables used for the modeling of the data.

Figure 1 Two-dimensional self-organising feature map of the trained data

Figure 1 indicates four possible segments, three of which are definitive, while the fourth is smaller and appears to be developing. However, for the purposes of this analysis, the respondents that form part of the "emerging" market segment are included together with respondents classified as part of segment 3. Table 3 indicates the size and number of respondents classified per segment.
Table 3 Number of respondents per segment for the traveller group
Segment number    Total respondents per segment    Percentage contribution
1                 693                              42.52
2                 421                              25.82
3                 516                              31.66
Total             1630                             100.00%
Table 3 shows that Segment 1 is the largest of the three segments and that 42.52 per cent of the respondents have a similar profile of attributes, views and characteristics. Among the remaining respondents, 31.66 per cent have a similar profile and are grouped in Segment 3, while the 25.82 per cent of respondents in Segment 2 could be considered a homogeneous group. The three segment profiles obtained from the classification of each tourist by the SOM neural network are presented in Table 4, with acronyms for each of the segments based on the profile.
Table 4 Profiles of the three segments identified from the self-organising feature map
Profile of segment 1: "Adventurists"
The large majority of tourists who form part of segment 1 either visit the Western Cape for holiday purposes or visit friends and relatives. This group also spends most of its vacation period in Cape Town and the Cape Peninsula. Other areas of interest to this group are the West Coast and Garden Route. The average duration of stay for this segment is 16 days (median 14 days), while most of the tourists spend between R3 000 and R4 000 on average (median R3 000 to R4 000), irrespective of the purpose of visit. The majority of this segment is from the White population group, of which almost 40 per cent are self-employed or work as clerical or sales staff. A small portion (11 per cent) of this group occupies professional positions in the business fraternity. Over 63 per cent of the tourists from this segment come from Gauteng. These tourists earn an average of between R10 500 and R12 999 per month (median R8 500 to R10 499) and are between the ages of 35 and 39.
Over 50 per cent of these individuals have a matric qualification, while 37 per cent have an additional diploma or university degree.
Profile of segment 2: "Yuppies"
Tourists classified under segment 2 visit the Western Cape for reasons other than visiting friends and relatives (VFR), leisure or business. However, almost a fifth also visit the Western Cape for business. They visit primarily Cape Town or the Cape Peninsula and spend approximately 12 days (median 10 days) in the province. They spend a low average total of between R2 001 and R3 000 (median R1 001 to R2 000) during this time. Interestingly, the ethnic spread in this segment is similar to that of segment 3 for the White, Black and Coloured/Asian groups. Almost 40% of this group are professionals, while about 37% are middle management, self-employed, clerical or sales staff. Approximately a third of these tourists come from the Eastern Cape, while close to 30% come from Kwazulu-Natal. These tourists earn an average of between R13 000 and R15 999 per month (median R10 500 to R12 999) and are between the ages of 35 and 39 (median 35-39 years of age). Sixty per cent of this segment has a matric qualification or diploma and almost 40% a university degree.
Profile of segment 3: "Content"
Tourists classified as part of segment 3 visit the Western Cape either to see friends or relatives or to have a holiday. Almost 15 per cent of this segment also visits the province for business purposes. More than 80 per cent of these tourists visit Cape Town or the Cape Peninsula during their visit to the Western Cape and spend approximately 12 days (median 10 days). This segment spends a low average total of between R1 001 and R2 000 (median R1 001 to R2 000). As mentioned for segment 2, the ethnic spread in this segment is similar across the White, Black and Coloured/Asian groups.
Almost 40 per cent of this group are housewives, students, retired persons or semi-skilled workers. Approximately 40 per cent of these tourists come from Gauteng, while almost a third are from Kwazulu-Natal and the Free State. These tourists earn an average of between R4 000 and R4 999 per month (median R4 000 to R4 999). The average age of this group is between 40 and 44 years (median 40-44 years of age), while more than three-quarters have a matric, Std 8, Std 9 or Std 9 with a diploma.
Note: The segment profiles are based on frequency tables compiled for each variable included in the SOM neural network modeling procedure and on several descriptive statistics.
These segment profiles should be considered as a means to distinguish between the different segments on an overall basis. The "Adventurists" seem to be scattered throughout the province and visit other areas in addition to Cape Town and the Peninsula. They are less trendy than the "Yuppies", who prefer to spend their holiday period in areas like Cape Town and the Peninsula that seem to have a certain vacation atmosphere. Interestingly, the "Yuppies" earn more than the "Adventurists" but spend less during their time in the Western Cape. The tourists in the "Content" segment are, by virtue of its profile, older than those of the other segments, retired or students, and less qualified. Their expenditure over the vacation period is also lower than that of the other two segments. The "Content" appear to be more content with life and are less energetic and adventurous than tourists whose profile fits the other two segments.
4.1.1 Implications of the segment classification for management decision making
The three profiles above provide three different behavioural segments, each portrayed by the acronym assigned to it. Consequently, decision makers could develop more focused marketing strategies instead of the more generic ones often derived from the supply side of the tourism industry (e.g. attractions).
These segment profiles could be incorporated into existing tourism positioning strategies by indicating more specifically to decision makers the kind of tourists the Western Cape would target in the context of the existing and envisaged tourism framework of the province. In addition, the research conducted in this paper will allow decision makers to track the behaviour of tourists by conducting similar surveys in the future and to determine the changing profile of key market segments for the Western Cape. Decision makers could also, from a similar analysis performed on tourists that do not visit the Western Cape, determine corresponding profiles and identify potential tourists that fit the profile of those tourists the industry decision makers in the Western Cape would want to attract.
4.2 Findings of the creation and induction of a decision tree
The decision tree and rule induction modeling could also be used to obtain a broad profile of a segment based on a specific set of rules. In addition, the rules could be used to classify tourists that take part in future or follow-up surveys. By using the variables indicated in Table 2 as inputs and an output variable representing the segment classification of each respondent, it is possible to compile a decision tree from which rules could be deduced. The creation of the decision tree and induction of rules are based on the criteria listed in section 3.2.2 (see endnote 2). The rules indicated in Table 5 and in the Appendix are interpreted as If ... Then rules. This implies that should a tourist meet the conditions of an individual rule, it would be possible to assign the tourist to a certain segment classification. In this manner the tourist could assume one of the profiles described in Table 4.
For instance, consider Rule 13 in the Appendix, which is the first rule applicable to segment 3 in Table 5. A description of the rule indicates that 167 cases were correctly classified as segment 3, while one case was incorrectly classified. The accuracy of the rule is 99.4 per cent, which is based on the test set (unseen data). Rule 13 could be described in the following manner:
IF the tourist spends less than R3 000 and (s)he is a "non-professional" and predominantly African, Coloured or Asian, and has a matric or lower education qualification, and earns less than R6 500 per month and is from the Free State, Northern Cape, Kwazulu-Natal or the Eastern Cape,
THEN the tourist would have a similar profile to the 516 respondents grouped within Segment 3.
The individual rules for the other segments are used in the same manner to describe a particular segment or classification of a tourist.
Table 5 Examples of rules for the different segments
Rules for segment 3 [516]*
IF a tourist spends less than R3 000 and is a non-professional and predominantly African, Coloured or Asian, and has a matric or lower education qualification, and earns less than R6 500 per month and is from the Free State, Northern Cape, Kwazulu-Natal or the Eastern Cape ...
or
IF the tourist visits Cape Town and the Peninsula and total spending is less than R2 000 and is predominantly African, Coloured or Asian and has a matric or lower qualification and earns R6 500 or less and is from the Free State, Northern Cape, Kwazulu-Natal, or the Eastern Cape
THEN the tourist would have a similar profile to the 516 respondents grouped within Segment 3.
Rules for segment 1 [693]
IF the tourist is younger than 60 years of age and is likely to visit the Western Cape in the future and spent more than R3 000 during the visit, and is predominantly white and is from the Northern Provinces of the country and earns more than R8 500 per month ...
or
IF the tourist visits other areas than the Garden Route and Cape Town and the Peninsula, and is not from the Eastern Cape, but the other provinces, and earns more than R8 500 per month and is a white collar employee ...
THEN the tourist would have a similar profile to the 693 respondents grouped within Segment 1.
Rules for segment 2 [421]
IF the tourist is a white collar worker, and has a matric or higher education, and is not from the northern provinces of South Africa and is predominantly African, Coloured or Asian, and spent more than R3 000 during the visit to the Western Cape ...
or
IF the tourist spends 15 days or less in the Western Cape, and is from the Free State, Northern Cape or the Eastern Cape, and has a university degree ...
THEN the tourist has a similar profile to the other 421 tourists who form part of segment 2.
* The value indicated in parenthesis represents the number of tourists for each of the segments.
Table 5 clearly indicates the class (segment), the "conditions" applicable to each of the attributes and the classification accuracy of the rule. These rules, which are determined in a quantitative manner, could also be combined with rule sets (qualitative information) obtained from experts.
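The way an If ... Then rule classifies a respondent from a follow-up survey can be illustrated with a short Python sketch. The conditions below are the thresholds listed for Rule 13 in the Appendix, as far as the printed layout allows them to be read, and the comments paraphrase the description of the rule given above. The data structure, the function name `rule_fires` and the example respondents are hypothetical and serve only as an illustration.

```python
import operator

# The rules in the Appendix use only these two comparison operators.
OPS = {"<": operator.lt, ">=": operator.ge}

# Rule 13 (class B = Segment 3), written as (variable, operator, threshold).
# Variable names and codes follow the Appendix and its legend.
RULE_13 = [
    ("totspend", "<", 4),     # spent less than R3 000
    ("occupation", ">=", 6),  # "non-professional" occupations
    ("race", ">=", 2),        # Black or Coloured/Asian
    ("education", "<", 5),    # matric or lower
    ("incgroup", "<", 9),     # earns less than R6 500 per month
    ("province", "<", 6),     # province codes below 6
]


def rule_fires(conditions, respondent):
    """Return True if the coded respondent satisfies every condition."""
    return all(OPS[op](respondent[var], threshold)
               for var, op, threshold in conditions)


if __name__ == "__main__":
    # Hypothetical coded survey responses.
    respondent_a = {"totspend": 2, "occupation": 8, "race": 2,
                    "education": 4, "incgroup": 7, "province": 4}
    respondent_b = dict(respondent_a, totspend=5)  # spent R4 001-5 000
    for name, resp in (("a", respondent_a), ("b", respondent_b)):
        verdict = "Segment 3" if rule_fires(RULE_13, resp) else "rule does not apply"
        print(name, verdict)
```

A respondent for whom no rule of a segment fires would have to be checked against the rules of the other segments in the same manner.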
The use of SOM neural network models together with the induction of rules from decision trees requires clear research objectives, domain knowledge of the specific problem, representation of appropriate attributes and clarity on the definition and acquisition of data.
5 CONCLUSION
Inadequate market segmentation and clustering problems could cause enterprises to either miss a strategic marketing opportunity or not cash in on a tactical campaign. Market segmentation has not only developed as a tool to segment markets and identify target markets, but could also be used at a higher level to help an enterprise understand the relationship with its customers. The need for in-depth knowledge of customer or tourist segments and the need to overcome the limitations of non-linear problems require a different approach. For instance, SOM neural network models based on artificial intelligence technology can be developed to create clusters based on combinations of natural characteristics within a set of customer data (e.g. purchase history, demographic attributes, psychographic characteristics, etc.).
The research demonstrates the usefulness of artificial intelligence technology as a means of grouping respondents and of profiling existing respondents by using a rule set applicable to each segment. The SOM neural network application, which could be considered an enabler, provided three segments (classes) that could be used to induce rules from decision trees. The rules provide decision makers with clear and concise indications of the segment profile of existing tourists based on an individual rule within a larger rule set. In addition, the rule sets for each segment could be used to classify tourists that partake in follow-up surveys.
The artificial intelligence application presented in this paper provides a mechanism for analysts to use if the objective of the research is to create customer segments and to describe the segments through a rule set applicable to each segment. The findings demonstrate that the analysis of data from surveys can be taken to a higher level through the provision of knowledge represented in sets of rules and by overcoming the limitations of clustering techniques such as cluster analysis.
ENDNOTES
1 I would like to express my gratitude to Western Cape Tourism for providing the data to conduct the research presented in this article. The usual caveat applies.
2 The decision tree and the additional rules (i.e. with accuracy of less than 90 per cent) are available on request.
APPENDIX
Rules with accuracy of larger than 90% (A legend is provided to assist with the interpretation of the rules)
1468 Training Cases and 162 Validation Cases
Segment 1 = A; Segment 2 = C; Segment 3 = B
Rule 13: B 167, 1 (99.4%) | Rule 9: B 88, 1 (98.9%) | Rule 72: B 79, 3 (96.2%)
totspend < 4; mainarea >= 8; education < 4; occupation >= 6; totspend < 2; incgroup < 10; race >= 2; race >= 2; race >= 2; education < 5; education < 5; province >= 6; incgroup < 9; incgroup < 9; occupation < 5; province < 6; province < 6
Rule 153: A 189, 1 (99.5%) | Rule 96: A 131, 2 (98.5%) | Rule 39: A 208, 7 (96.6%)
age < 10; incgroup >= 6; incgroup < 14; likelyvisit < 3; province >= 4; province >= 5; totspend >= 4; race < 2; totspend >= 5; race < 2; mainarea < 5; race < 2; province >= 6; incgroup < 10; incgroup >= 9; incgroup >= 10; occupation >= 5
Rule 7: B 131, 2 (98.5%) | Rule 114: A 179, 5 (97.2%) | Rule 45: B 109, 4 (96.3%)
education < 4; province >= 5; dayspent < 10; race >= 2; age < 6; mainarea >= 7; incgroup < 9; incgroup >= 7; incgroup < 8; province < 6; race < 2; education < 5; totspend >= 4; totspend < 4; incgroup < 10; province >= 6
Rule 23: A 117, 2 (98.3%) | Rule 52: B 94, 4 (95.7%) | Rule 120: C 50, 2 (96%)
mainarea < 6; age >= 5; occupation < 13; race < 2; totspend < 2; education >= 5; province >= 3; incgroup < 10; province < 7; incgroup >= 9; education < 5; race >= 2; occupation < 8; province >= 6; totspend >= 4; incgroup < 10
Rule 151: A 178, 3 (98.3%) | Rule 50: B 73, 2 (97.3%) | Rule 127: B 95, 3 (96.8%)
occupation < 12; occupation >= 6; totspend < 4; totspend >= 3; dayspent >= 10; age >= 4; incgroup >= 11; mainarea >= 7; incgroup < 11; race < 2; arrange >= 2; occupation >= 10; province >= 6; incgroup < [illegible]; province < 6; occupation >= 8; education < 5; incgroup >= 10; totspend < 4; race < 2
Rule 117: A 18, 1 (94.4%) | Rule 69: A 297, 15 (94.9%) | Rule 104: C 51, 3 (94.1%)
gender < 2; purpose < 8; dayspent < 15; arrange >= 4; totspend >= 4; province < 5; race < 2; race < 2; education >= 6; totspend >= 4; province >= 6; occupation >= 8; incgroup < 10; occupation < 8
Rule 112: B 70, 4 (94.3%) | Rule 140: A 133, 7 (94.7%) | Rule 95: B 124, 8 (93.5%)
occupation >= 12; mainarea < 7; incgroup < 6; incgroup < 7; race < 2; totspend < 4; mainarea >= 8; province >= 6; occupation >= 8; incgroup >= 10
Rule 2: B 17, 1 (94.1%) | Rule 25: B 139, 10 (92.8%) | Rule 84: C 86, 7 (91.9%)
totspend < 3; incgroup < 10; purpose < 3; province >= 3; purpose < 2; mainarea >= 8; race < 2; occupation >= 6; incgroup >= 8; education < 5; totspend < 3; dayspent >= 5; incgroup < 9; province >= 3; education >= 5; province < 6; province < 6; race >= 2; occupation < 8; occupation < 8
Rule 17: C 90, 8 (91.1%) | Rule 40: A 12, 1 (91.7%) | Rule 57: A 237, 22 (90.7%)
province < 4; purpose < 2; dayspent >= 5; incgroup >= 8; incgroup >= 14; totspend >= 3; education >= 5; province >= 5; incgroup >= 8; incgroup < 9; totspend >= 5; education < 5; race < 2; race < 2; province >= 6; occupation < 8
Legend for the different variables included in the rule induction modelling:
Number of times visited the Western Cape: 1 Once; 2 Twice; 3 Three times; 4 Four times; 5 More than 4
Purpose of visit to the Western Cape: 1 Visit friends or relatives; 2 Holiday; 3 Business purposes; 4 Study purposes; 5 Medical treatment; 6 Conference; 7 Other; 8 [illegible]
Main area visited while in Western Cape: 1 West Coast; 2 Winelands; 3 Breede River; 4 Overberg; 5 Central Karoo; 6 Klein Karoo; 7 Garden Route; 8 Cape Town and Cape Peninsula
Information source for the Western Cape: 1 Travel Agency; 2 Tourist Bureau; 3 Friends/relations; 4 AA; 5 Magazines; 6 Newspapers; 7 Internet; 8 Nowhere/myself; 9 Previous visits/experience; 10 Timeshare; 11 Radio/TV; 12 Pamphlets/Brochures; 13 Church; 14 Organised tours; 15 Sports Organisation; 16 Work/Business; 17 School/Tech; 18 Other
Time of year of visiting the Western Cape: 1 December/January (Summer); 2 February/April (Autumn); 3 May/August (Winter); 4 September/November (Spring)
Total spent on visit irrespective of the purpose: 1 R0-1000; 2 R1001-2000; 3 R2001-3000; 4 R3001-4000; 5 R4001-5000; 6 R5001-6000; 7 R6001-7000; 8 R7001-8000; 9 R8001-10,000; 10 More than R10,000
Arrangements made for visit to the Western Cape: 1 Travel Agent; 2 Self/Other in party; 3 Your company; 4 Family/friends; 5 Airline Office; 6 Tour Operator; 7 Other; 8 Church; 9 School teacher
Likelihood to visit the Western Cape again: 1 Will definitely visit; 2 Will probably visit; 3 Might visit; 4 Will probably not visit; 5 Will definitely not visit
Income group: 1 Up to 1499; 2 1500-1799; 3 1800-1999; 4 2000-2499; 5 2500-2999; 6 3000-3999; 7 4000-4999; 8 5000-6499; 9 6500-8499; 10 8500-10499; 11 10500-12999; 12 13000-15999; 13 16000-19999; 14 20000-24999; 15 25000-29999; 16 30000-34999; 17 35000-39999; 18 40000+; 19 Confidential
Occupation: 1 Professional; 2 Senior Management; 3 Middle Management; 4 Junior Management; 5 Self-employed; 6 Clerical/Sales; 7 Tradesman/Skilled; 8 Semi-skilled; 9 Unskilled; 10 Housewife; 11 Student; 12 Pensioner/retired; 13 Other; 14 Unemployed/not working
Province of origin: 1 W Cape; 2 E Cape; 3 N Cape; 4 Free State; 5 KZN; 6 Northwest province; 7 Gauteng; 8 Mpumalanga; 9 Northern Province
Age grouping: 1 18-19; 2 20-24; 3 25-29; 4 30-34; 5 35-39; 6 40-44; 7 45-49; 8 50-54; 9 55-59; 10 60-64; 11 65+; 12 Confidential
Level of education: 1 Less than Std 6; 2 Std 6-7; 3 Std 8/9/9+Diploma; 4 Matric; 5 Matric/Diploma; 6 University Degree
Gender: 1 Male; 2 Female
Ethnic group: 1 White; 2 Black; 3 Coloured/Asian
REFERENCES
1 BEN-DAVID, A. & MANDEL, J. (1995) Classification Accuracy: Machine Learning vs. Explicit Knowledge Acquisition, Machine Learning, 18: 109-114.
2 BERSON, A., SMITH, S. & THEARLING, K. (2000) Building Data Mining Applications for CRM, New York: McGraw-Hill.
3 BLUM, A. (1992) Neural Networks in C++: An Object-Orientated Framework for Building Connectionist Systems, New York: Wiley.
4 BREIMAN, L. (1996) Technical Note: Some Properties of Splitting Criteria, Machine Learning, 24: 41-47.
5 BRODLEY, C.E. & UTGOFF, P.E. (1995) Multivariate Decision Trees, Machine Learning, 19: 45-77.
6 CRUSADER SYSTEMS (1998a) Basic Modelgen, Version 1.6. Pretoria.
7 CRUSADER SYSTEMS (1998b) Basic Modelgen Users Manual. Pretoria.
8 DEBOECK, G.J. (1995) Trading on the Edge: Neural, Genetic and Fuzzy Systems for Chaotic Financial Markets, New York: Wiley.
9 DIBB, S. & SIMKIN, L. (1991) Targeting segments and positioning, International Journal of Retail and Distribution Management, 19(3): 4-10.
10 GOONATILAKE, S. (1995) Intelligent Systems for Finance and Business: An Overview, In S. Goonatilake and P. Treleaven, Intelligent Systems for Finance and Business: 1-28, New York: Wiley.
11 GRAY, N.A.B. (1990) Capturing Knowledge through Top-Down Induction of Decision Trees, IEEE Expert: 41-51.
12 KAASTRA, I. & BOYD, M. (1996) Designing a Neural Network for Forecasting Financial and Economic Time Series, Neurocomputing, 10: 215-36.
13 KAMBER, M., WINSTONE, L., GONG, W., CHENG, S. & HAN, J. (1997) "Generalization and Decision Tree Induction: Efficient Classification in Data Mining", Paper presented at the International Workshop on Research Issues on Data Engineering. Birmingham, England: 1-25.
14 KLIMASAUSKAS, C.C. (1996) "Applying Neural Networks", In R.R. Trippi and E. Turban, Neural Networks in Finance and Investment: 45-69, New York: McGraw-Hill.
15 KOHONEN, T. (1988) Self-Organisation and Associative Memory, New York: Springer.
16 KOTLER, P. (2000) Marketing Management, Englewood Cliffs: Prentice-Hall.
17 LIPPMAN, R.P. (1989) An Introduction to Computing with Neural Nets, IEEE ASSP Magazine: 4-22.
18 MASTERS, T. (1993) Practical Neural Network Recipes in C++, New York: Academic Press.
19 MCCORD-NELSON, M. & ILLINGWORTH, W.T. (1993) A Practical Guide to Neural Nets, New York: Academic Press.
20 MITCHELL, V. (1994) "How to Identify Psychographic Segments: Part 1", Marketing Intelligence and Planning, 12(7): 4-10.
21 PATTIE, D.C. & SNYDER, J. (1996) "Using a Neural Network to Forecast Visitor Behaviour", Annals of Tourism Research, 23(1): 151-64.
22 QUINLAN, J.R. (1993) C4.5: Programs for Machine Learning, San Mateo: Morgan Kaufmann.
23 RYMAN-TUBB, N. (1993) "The Use of Neural Networks to Identify the Characteristics of Holiday Markers", The Journal of Database Marketing, 1(2): 140-49.
24 SAARENVIRTA, G. (1998) "Mining Customer Data", DB2 Magazine, 3(3): 10-20.
25 STATSOFT (1998) Statistica Version 5.5. Tulsa, OK.
26 STATSOFT (2001) Electronic Text Book - www.statsoft.com. Tulsa, OK.
27 CREATIVE RESEARCH SYSTEMS (2001) Survey Systems, Version 8.0. Petaluma, CA.
28 TURBAN, E. & TRIPPI, R.R. (1996) Neural Network Fundamentals for Financial Analysts, In R.R. Trippi and E. Turban, Neural Networks in Finance and Investment: 3-24, New York: McGraw-Hill.
29 VENUGOPAL, V. & BAETS, W. (1994) "Neural Networks and Statistical Techniques in Marketing Research: A Conceptual Comparison", Marketing Intelligence and Planning, 12(7): 30-38.