A Data Mining Approach for Predicting Customers’ Life Time Value in the Automotive Industry
(Research Seminar, April 1st, 2004)

Jacob Zahavi
Tel-Aviv University;
University of Pennsylvania


Abstract
Lifetime Value (LTV) is one of the core foundations of modern CRM (Customer Relationship Management). It is used to support both strategic and tactical decisions. Yet predicting the LTV of an individual customer is not simple, and the approach varies from one industry to another. Calculating the LTV is especially difficult in the automotive industry, which is characterized by long purchase cycles, data availability issues, sparse data, tough competition, and changing economic conditions, among other factors. In this seminar we describe a data mining approach for predicting the LTV at the individual customer level in the automotive industry. The data mining models were embedded within the LTV economic model to predict the required individual-level parameters.

The basic economic model is:

LTV = Loyalty Probability × Expected Profit per-vehicle-sale × Discount Factor

The loyalty probability was estimated for each customer by means of logistic regression. The expected profit per-vehicle-sale depends on the transition probabilities between car segments. These probabilities were estimated for each customer using multiple binary logistic regressions, one model for each “major” car segment. The discount factor is defined as the present value of a stream of “$1.00 vehicle” purchases made by a customer who buys a car M years from today and every Y years thereafter. The specific purchase cycle for each customer was estimated by means of survival analysis. A by-product of the purchase cycle model is the set of in-market timing probabilities, which can be used for targeting customers who are most likely to “look around” for a new car in a given time period.
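The discount factor defined above can be illustrated with a minimal Python sketch. It assumes an infinite stream of purchases and a constant annual discount rate; neither assumption is stated in the abstract, and the function name, parameter names, and example values are hypothetical.

```python
def discount_factor(m_years, y_years, annual_rate):
    """Present value of $1.00 spent M years from today and every Y years thereafter.

    Assumptions (not from the abstract): the purchase stream never ends and
    the annual discount rate is constant.
    """
    if m_years < 0:
        raise ValueError("m_years must be non-negative")
    if y_years <= 0 or annual_rate <= 0:
        raise ValueError("y_years and annual_rate must be positive")
    v = 1.0 / (1.0 + annual_rate)  # one-year discount multiplier
    # Geometric series: v**M + v**(M+Y) + v**(M+2Y) + ... = v**M / (1 - v**Y)
    return v ** m_years / (1.0 - v ** y_years)


# Hypothetical customer: next purchase in 2 years, then every 5 years, 8% rate.
example_factor = discount_factor(2, 5, 0.08)
print(round(example_factor, 4))
```

In this sketch, M and Y would be supplied per customer by the purchase cycle model, so a customer with a shorter cycle receives a larger discount factor.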

All these models were combined into a coherent scoring program for calculating the LTV of each individual customer. The model was applied to a leading automotive company in a European country. We will discuss the data mining process, focusing on data availability, the data mining models, the validation process, implementation issues, and some adjustments that were made to compensate for the lack of information about competitive-make vehicles.
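As a rough illustration of how the model outputs might be combined into a single score, the following Python sketch multiplies the three components of the economic model. It is not the scoring program used in the study: the segment names, profit figures, and the step that rescales the per-segment scores to sum to one are all assumptions made for the example.

```python
def score_customer(loyalty_prob, segment_scores, segment_profits, discount_factor):
    """LTV = loyalty probability * expected profit per vehicle sale * discount factor.

    segment_scores holds one score per car segment (for example, the outputs of
    separate binary models). Rescaling them to sum to 1 is an assumption of
    this sketch, not a step described in the abstract.
    """
    total = sum(segment_scores.values())
    if total <= 0:
        raise ValueError("segment scores must sum to a positive value")
    # Expected profit: probability-weighted average over the possible next segments.
    expected_profit = sum(
        (score / total) * segment_profits[segment]
        for segment, score in segment_scores.items()
    )
    return loyalty_prob * expected_profit * discount_factor


# Hypothetical inputs for one customer.
ltv = score_customer(
    loyalty_prob=0.4,
    segment_scores={"compact": 0.6, "midsize": 0.3, "luxury": 0.1},
    segment_profits={"compact": 1000.0, "midsize": 2000.0, "luxury": 5000.0},
    discount_factor=2.5,
)
print(round(ltv, 2))
```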