# Factor and Cluster Analysis in CRM

In this age of customer relationship management, or in the more familiar world of plain, good direct marketing, it is important to segment customers and communicate with them in ways that resonate. Thus, an important goal of direct marketing practitioners is customer segmentation: identifying internally uniform groups that the marketer should treat differently from one another.

There are various ways to segment customers. In some situations, direct marketers can employ common sense. For example, marketers may treat new customers differently from loyal ones. Where common sense needs more data to support a plan of action, RFM (recency, frequency, monetary value) analysis has been used successfully for years and is still valuable in some cases.
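As a rough illustration of RFM (recency, frequency, monetary value) scoring, the sketch below ranks a handful of made-up customers on each dimension and assigns a score from 1 to 3. The customer records, the analysis date, the three-bin scale and the `rank_scores` helper are all assumptions made for this example; real RFM schemes vary in how they bin and weight the three dimensions.

```python
from datetime import date

# Hypothetical records: (customer_id, last_purchase, order_count, total_spend)
customers = [
    ("A", date(2024, 6, 20), 12, 940.0),
    ("B", date(2024, 1, 5), 2, 60.0),
    ("C", date(2024, 5, 30), 5, 310.0),
    ("D", date(2023, 11, 12), 1, 25.0),
    ("E", date(2024, 6, 1), 8, 1500.0),
    ("F", date(2024, 3, 15), 3, 120.0),
]
today = date(2024, 7, 1)  # assumed analysis date


def rank_scores(values, n_bins=3, higher_is_better=True):
    """Score each value from 1 (worst) to n_bins (best) by its rank."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * len(values)
    for rank, i in enumerate(order):
        scores[i] = 1 + rank * n_bins // len(values)
    return scores


# Recency: fewer days since the last purchase is better.
days_since = [(today - c[1]).days for c in customers]
r = rank_scores(days_since, higher_is_better=False)
f = rank_scores([c[2] for c in customers])  # frequency: more orders is better
m = rank_scores([c[3] for c in customers])  # monetary: more spend is better

rfm = {c[0]: (r[i], f[i], m[i]) for i, c in enumerate(customers)}
for customer_id, score in sorted(rfm.items(), key=lambda kv: kv[1], reverse=True):
    print(customer_id, score)
```

Customers with high scores on all three dimensions would be candidates for the most favorable treatment; the scheme is transparent, but it uses only three variables.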

However, marketers can now track many more variables, and some of them may be more relevant than common sense or recency, frequency and monetary value alone.

Increasingly, marketers rely on at least two more sophisticated statistical techniques to segment customers effectively: factor analysis and cluster analysis. These techniques are especially useful for very large data sets that contain many variables.

Statisticians often use cluster analysis in tandem with factor analysis to develop natural customer segments by examining customer data and identifying patterns. Rather than applying an arbitrary characterization scheme, these methods help ensure that customers are grouped and defined by who they are and what they do.

## The state of interdependence

Both factor and cluster analyses are multivariate and interdependence techniques.

Multivariate techniques can analyze multiple variables at once and can determine the complex relationships that exist among them. For example, a multivariate analysis might look at the interplay among income, education, geographic location and other grouping variables, along with transaction data or account activity.

Interdependence techniques are used to organize and sort objects and people by similar characteristics. We all do this every day as we describe groups in terms of general characteristics rather than by describing every single object in detail.

If you had only a few variables and a small number of customers, you could act intuitively or conduct simple split tests and probably would not need the more advanced methods of statistical analysis, testing and data modeling.

However, as the cost of data processing and storage has fallen, the number of available variables has increased greatly. You are no longer satisfied with treating loyal customers as a single group, for example. You want to distinguish among the various types of loyal customers, based on purchase behavior as well as any number of demographic and lifestyle indicators.

As the number of variables has increased, so has the likelihood that some variables are related to one another, and that some reflect a more general underlying trend or relationship that, when acted upon, creates a discernible, predictable marketing impact.

Factor analysis is especially suited to these increasingly common data challenges. It can analyze the relationships among many variables, and it can condense or summarize many variables into a smaller group of factors.

For example, perhaps a consumer retailer is working with six variables known about its customers: income range, home value range, net savings estimates, total spending with the retailer, total items purchased from the retailer and the total number of item categories purchased from the retailer. Through factor analysis, these six variables might be condensed into two factors, which you might call affluence and purchase behavior. Then in any future analysis, you need only consider two representative factors, rather than six distinct variables.
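The six-variable example can be sketched in code. The block below is a simplified stand-in rather than a full factor analysis: it uses principal-component extraction on the correlation matrix (a common first step in factor work) with no rotation, implemented by power iteration so that it needs only the standard library. The data are synthetic, generated from two hidden drivers, and the variable names, noise levels and the helpers `correlation_matrix` and `top_eigenpairs` are assumptions made for illustration.

```python
import math
import random

rng = random.Random(0)
names = ["income_range", "home_value_range", "net_savings",
         "total_spend", "total_items", "total_categories"]

# Synthetic customers: two hidden drivers generate six observed variables.
rows = []
for _ in range(2000):
    affluence = rng.gauss(0, 1)
    purchasing = rng.gauss(0, 1)
    rows.append(
        [affluence + rng.gauss(0, 0.3) for _ in range(3)]
        + [purchasing + rng.gauss(0, 1.0) for _ in range(3)]
    )


def correlation_matrix(data):
    """Pearson correlation between every pair of columns."""
    p = len(data[0])
    cols = [[row[j] for row in data] for j in range(p)]
    centered = [[x - sum(c) / len(c) for x in c] for c in cols]
    norms = [math.sqrt(sum(x * x for x in c)) for c in centered]
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(p)] for i in range(p)]


def top_eigenpairs(matrix, k, iterations=500):
    """Largest k eigenpairs of a symmetric matrix (power iteration + deflation)."""
    m = [row[:] for row in matrix]
    p = len(m)
    pairs = []
    for _ in range(k):
        v = [1.0 / math.sqrt(p)] * p
        for _step in range(iterations):
            w = [sum(m[i][j] * v[j] for j in range(p)) for i in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        value = sum(v[i] * sum(m[i][j] * v[j] for j in range(p)) for i in range(p))
        pairs.append((value, v))
        for i in range(p):  # remove the component just found
            for j in range(p):
                m[i][j] -= value * v[i] * v[j]
    return pairs


corr = correlation_matrix(rows)
pairs = top_eigenpairs(corr, k=2)

# Loading = eigenvector entry scaled by the square root of its eigenvalue.
loadings = [[vec[i] * math.sqrt(val) for val, vec in pairs]
            for i in range(len(names))]
dominant = [max(range(2), key=lambda c: abs(row[c])) for row in loadings]

for name, row, d in zip(names, loadings, dominant):
    cells = " ".join(f"{x:+.2f}" for x in row)
    print(f"{name:18s} {cells}  -> component {d + 1}")
```

In this synthetic setup the three wealth-related variables load mainly on one component and the three purchase-related variables on the other, which is the kind of pattern an analyst might label affluence and purchase behavior. Real data are rarely this clean, and naming the factors remains a judgment call.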

## The marketer and data analyst must work together

Factor analysis has limitations, however. By definition, factor analysis will always produce factors, even if they do not really exist. And if you thoughtlessly introduce a bunch of variables into the analysis, your results will likely prove unsatisfactory.

That is why it is important to have marketing and analytics professionals work together. Through informed dialogue and knowledge of the business, the data analyst can determine the optimal number of factors to develop and the specific techniques to use to derive meaningful relationships among variables, and can avoid or discount relationships that might prove spurious.

Once you have applied factor analysis to the variables and the correct factors have been isolated, it is time to perform cluster analysis.

Cluster analysis has two primary steps. In the first step, the analyst determines how many mutually exclusive groups exist. In the second step, customers are assigned to one of the groups, based upon their factor scores. To refine and confirm results, many statisticians will perform cluster analysis using both the factors created in the initial factor analysis stage and the initial data variables themselves.
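The two steps can be sketched with k-means, one of several clustering methods an analyst might choose. The example below assumes each customer already has two factor scores, uses synthetic data with three planted groups, and picks the number of clusters with a crude elbow-style heuristic (the k that gives the largest proportional drop in within-cluster variation). The data, the `kmeans` helper, its farthest-first initialization and the heuristic are all illustrative assumptions; in practice the choice of k also depends on business interpretability.

```python
import random

rng = random.Random(42)

# Synthetic factor scores (affluence, purchase behavior) for 150 customers,
# generated around three planted group centers.
true_centers = [(-2.0, -2.0), (2.0, -2.0), (0.0, 2.0)]
points = [(cx + rng.gauss(0, 0.4), cy + rng.gauss(0, 0.4))
          for cx, cy in true_centers for _ in range(50)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(data, k, max_iter=100):
    """Plain k-means with a deterministic farthest-first initialization."""
    centers = [data[0]]
    while len(centers) < k:
        centers.append(max(data, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                      for p in data]
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    wcss = sum(sq_dist(p, centers[lab]) for p, lab in zip(data, labels))
    return labels, centers, wcss


# Step 1: decide how many groups to use.
wcss_by_k = {k: kmeans(points, k)[2] for k in range(1, 7)}
best_k = min(range(2, 7), key=lambda k: wcss_by_k[k] / wcss_by_k[k - 1])

# Step 2: assign every customer to exactly one group.
labels, centers, _ = kmeans(points, best_k)

print("chosen k:", best_k)
for c, center in enumerate(centers):
    print(c, labels.count(c), tuple(round(v, 2) for v in center))
```

Because each customer receives exactly one label, the resulting groups are mutually exclusive, as the first step requires.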

Several methods and algorithms are used to develop groups and assign membership; all of them are theoretically valid, but each will yield different solutions. In practice, the analyst typically develops several different solutions, consults with marketing and selects the one that makes the most sense. The marketing department makes a very important contribution by ensuring that the selected cluster solution is suited to the company's business goals.

Like factor analysis, cluster analysis has weaknesses. Different clustering techniques will produce different cluster solutions, which illustrates the subjective nature of this analysis. And cluster analysis will always produce a solution even if real clusters do not exist.

Finally, there is no statistical foundation for thinking that clusters developed from a small sample will apply to the general population. These limitations dictate the need to have knowledge about the business when such data analysis is performed. And they dictate a regimen of testing to ensure the usefulness of the results.
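The warning that cluster analysis always returns a solution can be illustrated with a small experiment. The sketch below runs the same k-means routine on two synthetic data sets, one with three planted groups and one that is uniform noise, and compares them with the mean silhouette score, one common diagnostic of cluster separation. The data, the `kmeans` and `mean_silhouette` helpers and the choice of diagnostic are assumptions made for illustration, not a prescribed testing regimen.

```python
import math
import random

rng = random.Random(7)

# Data with three planted groups versus structureless uniform noise.
structured = [(cx + rng.gauss(0, 0.4), cy + rng.gauss(0, 0.4))
              for cx, cy in [(-2, -2), (2, -2), (0, 2)] for _ in range(50)]
noise = [(rng.uniform(-3, 3), rng.uniform(-3, 3)) for _ in range(150)]


def kmeans(data, k, max_iter=100):
    """Plain k-means with a deterministic farthest-first initialization."""
    centers = [data[0]]
    while len(centers) < k:
        centers.append(max(data, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = None
    for _ in range(max_iter):
        new_labels = [min(range(k), key=lambda c: math.dist(p, centers[c]))
                      for p in data]
        if new_labels == labels:
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


def mean_silhouette(data, labels):
    """Average silhouette: near 1 means tight, well-separated clusters."""
    clusters = {}
    for p, lab in zip(data, labels):
        clusters.setdefault(lab, []).append(p)
    total = 0.0
    for p, lab in zip(data, labels):
        own = clusters[lab]
        if len(own) == 1:
            continue  # a singleton contributes a silhouette of 0
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        b = min(sum(math.dist(p, q) for q in other) / len(other)
                for key, other in clusters.items() if key != lab)
        total += (b - a) / max(a, b)
    return total / len(data)


structured_labels = kmeans(structured, 3)
noise_labels = kmeans(noise, 3)  # still returns three "segments"

score_structured = mean_silhouette(structured, structured_labels)
score_noise = mean_silhouette(noise, noise_labels)
print("structured:", round(score_structured, 2))
print("noise:     ", round(score_noise, 2))
```

The algorithm partitions the noise into three groups just as readily as the structured data; only the diagnostic, and ultimately a test in the market, indicates whether the segments mean anything.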

When used with care, factor and cluster analyses are remarkable tools: they can reveal new knowledge about customers and fuel more effective target marketing efforts.