Tuesday, March 13, 2018

Pricing Model for an "Older" Model -- A Toyota Camry Case Study

While new-automobile prices are anchored by the Manufacturer’s Suggested Retail Price (MSRP), the pre-owned auto market is largely unregulated, causing significant price differences from dealer to dealer. Glitzy showrooms with large advertising budgets tend to have more pricing power than their less ostentatious counterparts, often listing comparable pre-owned autos at prices 30-50% higher than their weaker competitors.

A number of online auto information and listing service companies have emerged over the last decade or so, providing meaningful data and pricing support for consumers. In addition to making consumers aware of the damage, accident and service history and the odometer readings of pre-owned cars, they have been offering independent price points as well, hopefully leading to a more structured and secure pre-owned market. Some even offer purchase protection in case a bad purchase results from faulty data.

In order to arrive at prices independently, the info companies have been developing proprietary pricing models that predict prices efficiently and en masse. Since their model development process is not made public, even the most data-savvy consumers are left to guesswork at best, which has prompted the author to publish this book, detailing his perspective on pricing models for popular pre-owned vehicles (“Autos,” “Cars,” “Vehicles”).

The methodologies used and proposed here can easily be replicated on an existing database or with some data collection. In order to teach readers how to model pre-owned auto data, all modeling samples have been extracted from current institutional listings pertaining to the most popular as well as the longest-running domestic and foreign cars. Also, all mileage, damage/accident and ownership data have been verified against third-party databases.

Additionally, in order to make learning easy and pleasurable, a set of structured methodologies will be utilized throughout the modeling process; for example, in examining the predictive relationship between the dealer pricing (dependent variable in the model) and the predictive variables (independent variables), a correlation matrix will be used. Similarly, the Multiple Regression Analysis (“MRA”) will be used to develop the pricing model (“Model”) and predict the line-item prices (“Model Est”). Any generic statistical software or Excel’s Data Analysis module will help perform these functions.    

Moreover, due to changes in market dynamics, modeling the older auto models is quite different from modeling the mid-age and newer models. For instance, competition in the older-model market centers on dealers vs. private sellers. For newer models it gradually shifts to dealers vs. off-lease units, dealers vs. rental companies, and even dealers vs. manufacturers (which offer large rebates on unsold inventory before the new models are introduced). Also, given the significantly higher prices, financing becomes more critical for the newer models than for the older ones.

Finally, in keeping with these changes in market dynamics, the following structured format will be used: 2006 lines will represent the “older” models, while 2010 and 2015 models will represent the “mid-age” and “newer” models, respectively. Accordingly, the first three chapters will be devoted to these three models: 2006 Toyota Camry (older), 2010 Honda Accord (mid-age) and 2015 Nissan Altima (newer).

Modeling 2006 Toyota Camry

The Camry has been one of Toyota’s bread-and-butter models in the US since it was reintroduced as a wide-bodied mid-size car. Despite having gone through a number of design and body changes, it has been the best-selling passenger vehicle here since 1997, save a year or two in between. In fact, over 90% of them are still roaming the streets. Given this extraordinary success, it is only fitting that the 2006 Camry spearheads the modeling journey.

The 2006 Camry came in three primary trims (“packages”), Standard, LE and XLE, with LE leading the production and sales. XLE represented the top of the line with a V6 engine, sunroof, upgraded audio, leather and a luxury power pack. Roughly 450K Camry units were sold in 2006, beating all prior records.  

Modeling Step 1 (Correlation Matrix)

The above correlation matrix sets the table for modeling. Dealer Price (abbreviated here as D/Price) has the highest (negative) correlation with Miles. The negative correlation coefficient signifies that higher mileage dampens asking prices in the market.

Prior ownership (“Owner”) is the next most important predictor of the dealer price. Owner is a linearized 3-category variable with 1-owner receiving the highest rating followed by two other categories: 2-3 owners and 4-5 owners.
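
The linearization of Owner described above can be sketched as a simple lookup. This is only an illustration: the post states the ordering of the three categories, while the 3/2/1 scale and the function name `owner_rating` are assumptions made here.

```python
def owner_rating(num_owners: int) -> int:
    """Map a prior-owner count to an ordinal rating (higher is better).

    The 3/2/1 scale is assumed for illustration; the post only says that
    1-owner cars rate highest, followed by 2-3 owners and then 4-5 owners.
    """
    if num_owners == 1:
        return 3
    if 2 <= num_owners <= 3:
        return 2
    if 4 <= num_owners <= 5:
        return 1
    raise ValueError(f"owner count outside the modeled range: {num_owners}")


# Rate cars with one through five prior owners.
ratings = [owner_rating(n) for n in (1, 2, 3, 4, 5)]
print(ratings)
```

Treating the categories as one numeric column keeps the regression to a single Owner coefficient, at the cost of assuming equal price steps between categories.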

Accident and Warranty are the two important binary variables, though the latter represents a dealer warranty, as the original manufacturer’s warranty had long expired. The V6 Engine, Sunroof and Upgraded Audio are all part of the XLE package, hence the noticeably high multicollinearity among them, which forces this trio out of the modeling equation and leaves Package to stand on its own.
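
For readers who want to reproduce this step outside Excel, a correlation matrix can be sketched in a few lines of Python. The six listings below are hypothetical, not the modeling sample, and the Package coding (1 = Std, 2 = LE, 3 = XLE) is an assumption. The sketch shows the two patterns discussed above: a strongly negative D/Price vs. Miles coefficient, and near-perfect correlation among features bundled into the same package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical listings for illustration only (not the post's data).
data = {
    "D/Price": [7900, 7200, 6800, 6100, 5500, 4900],
    "Miles": [85000, 102000, 118000, 135000, 160000, 190000],
    "Package": [3, 2, 3, 2, 1, 2],  # assumed coding: 1=Std, 2=LE, 3=XLE
    "V6": [1, 0, 1, 0, 0, 0],       # bundled with XLE in this toy sample
    "Sunroof": [1, 0, 1, 0, 0, 0],  # bundled with XLE in this toy sample
}

names = list(data)
matrix = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

# Print one row per variable, one column per variable.
for a in names:
    print(f"{a:8s}", "  ".join(f"{matrix[a][b]:+.2f}" for b in names))
```

When two candidate predictors correlate this strongly with each other, keeping only one of them (here, Package) is a common way to avoid unstable coefficients.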

Modeling Step 2 (Multiple Regression Analysis)

The above MRA output confirms that the negative correlation between Miles and dealer price carries over into the model as a negative contribution to the predicted price. Accident is the most significant independent variable (highest t stat and lowest P-value), followed by Warranty, Package, Owner and Miles. The model R-square, 0.93646259 (about 0.94), is high, and could rise further if the model is rerun without outliers.
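
As a rough illustration of what the regression step computes, the sketch below fits an ordinary least squares model by solving the normal equations and reports R-square. It is not the author's Excel output: the ten listings are hypothetical, only three predictors are used, Miles is expressed in thousands (an assumption made for numerical convenience), and the helper names `solve` and `fit_ols` are invented here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return (coefficients with intercept first, R-square)."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in design]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - f) ** 2 for t, f in zip(y, fitted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return beta, 1 - ss_res / ss_tot


# Hypothetical listings: (miles in thousands, accident 0/1, warranty 0/1).
rows = [
    (85, 0, 1), (98, 0, 0), (105, 1, 0), (112, 0, 1), (126, 0, 0),
    (134, 1, 0), (148, 0, 0), (157, 1, 1), (171, 0, 0), (190, 1, 0),
]
prices = [8800, 7450, 6050, 8050, 7050, 5350, 6650, 5800, 6000, 4380]

beta, r_square = fit_ols(rows, prices)
for name, coef in zip(["Intercept", "Miles (000s)", "Accident", "Warranty"], beta):
    print(f"{name:13s} {coef:10.2f}")
print(f"R-square      {r_square:10.4f}")
```

A statistics package would also report the t stats and P-values used above to rank the variables; this sketch stops at the coefficients and R-square.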

To interpret the model coefficients: consumers prefer cars that have had no reported damage or accidents. Cars backed by some sort of dealer warranty are in higher demand (considering that the 2006 model is 10+ years old now), and one-owner cars are preferred to those with multiple prior owners. Also, the better-equipped packages, XLE and LE, are more sought after than the baseline Standard model. As expected, a typical buyer pays less for a car with higher mileage.

Modeling Step 3 (Analysis of Model Estimates)

The above percentile graph shows that the dealer prices and model estimates are more or less similar up to the median (50th percentile), beyond which the model estimates start to curve down, indicating that the dealer prices at the upper end of the curve are on average $400-$500 above the market. This suggests that the model estimates, being independently derived, could help consumers and dealers quickly get on the same page. They would also help private sellers, who tend to feel quite confused, often clueless, when making quantitative adjustments to the available comps (adjusting the comps to their subjects).
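
A percentile comparison of this kind can be sketched with the standard library. The two price lists below are hypothetical and were chosen only to mimic the pattern described above (agreement up to the median, dealer prices running higher beyond it); they are not the post's data.

```python
from statistics import quantiles

# Hypothetical dealer prices and model estimates for the same 11 listings.
dealer = [4200, 4900, 5400, 5800, 6100, 6400, 6900, 7400, 7900, 8500, 9200]
model = [4300, 4950, 5450, 5750, 6150, 6400, 6600, 7000, 7450, 8000, 8700]

# Nine cut points each: the 10th, 20th, ..., 90th percentiles.
dealer_deciles = quantiles(dealer, n=10, method="inclusive")
model_deciles = quantiles(model, n=10, method="inclusive")

for pct, d, m in zip(range(10, 100, 10), dealer_deciles, model_deciles):
    print(f"{pct}th percentile: dealer {d:,.0f}  model {m:,.0f}  gap {d - m:+,.0f}")
```

Comparing the two distributions percentile by percentile, rather than listing by listing, shows where along the price range the dealer prices and model estimates drift apart.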

While dealers are asking the highest average price for the standard model (Std), the Model is nonetheless predicting (“Model Est”) the lowest price for it, with the other packages ascending in the proper order. The Model prices the standard model $1,300 lower than the average dealer price, sending a clear warning to potential buyers about the overpricing. This is why a set of independent model estimates is so critically important in protecting general consumers. By the same token, dealers would be alerted to the potential under-pricing of the XLEs. As indicated before, LEs, in line with the original production, comprise nearly two-thirds of the modeling sample.


The Accident variable provides excellent consumer protection, safeguarding those who are particularly risk-averse. The Model is predicting roughly $1,500 more (6,894 – 5,353) for the vehicles without any reported damage or accidents. Similarly, dealers can benefit by using the roughly $300 higher model estimates (6,894 – 6,600) to re-price their accident-free fleet.

Nowadays, vehicle data reports like CarFax* and AutoCheck* are readily available in the pre-owned market, instantly alerting buyers shopping on-site of many noteworthy issues like title, safety, accident, odometer, prior ownership, etc.     

While only 18% are able to buy with a dealer warranty, they come out roughly $1,000 ahead (7,724 – 6,755), since the model estimate exceeds the average dealer price for these cars. Conversely, those who buy without any warranty are overpaying by $435 on average (6,398 – 5,963).
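
The gap arithmetic quoted for the accident-free and warranty groups can be checked in a few lines. The dollar figures are the averages quoted in this post. Reading the first number of each warranty pair as the higher of model estimate and dealer price, as the pairing below does, is an interpretation inferred from the surrounding discussion rather than something the post states outright.

```python
# (average dealer price, average model estimate) in dollars, as quoted above.
# Which figure is the dealer price and which the model estimate is inferred
# from the text, not stated explicitly in the post.
groups = {
    "No accident": (6600, 6894),
    "With warranty": (6755, 7724),
    "Without warranty": (6398, 5963),
}

# Positive gap: the model values the group above what dealers are asking.
gaps = {name: model - dealer for name, (dealer, model) in groups.items()}

for name, gap in gaps.items():
    label = "under-priced by dealers" if gap > 0 else "over-priced by dealers"
    print(f"{name:17s} {gap:+,d}  ({label})")
```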

Again, by using the model estimates both parties could win. While dealers are overcharging roughly 8 out of 10 customers (those buying without warranty), they are losing significantly by undercharging when selling with warranty. Evidently, customers are willing to pay a hefty sum for a warranty to earn some peace of mind, that is, to avoid having to deal with a lemon from the get-go. Of course, warranty services for older vehicles are generally more expensive than those for newer ones.

The Model reveals that dealers are overpricing the lower-mileage vehicles while somewhat under-pricing their higher-mileage equivalents. Access to the model values would therefore help dealers price their inventory more accurately, without having to rely on these two groups offsetting each other. The lure of lower-mileage vehicles is leading consumers to pay an unwarranted premium, which could be avoided if the model estimates were published alongside the dealer prices.

Consumers are benefiting from the dealers’ under-pricing of single-owner units, even though these units have lower-than-average mileage. Of course, the dealers are making up those losses by overcharging in the other two categories, particularly in the 4–5 ownership category. Unfortunately, these less desirable vehicles (high mileage, prior accidents, many prior owners, etc.) are often pushed onto consumers with less-than-perfect credit by bundling sub-prime loans.

The above table points to the presence of location arbitrage. Midwest and West Coast markets are seriously overpriced while South and Northeast are somewhat under-priced. Of course, considering that the Midwest and West Coast are not typical Camry country, this price imbalance could be temporary, resulting from a short-term supply shortage.

The Model is showing an aberration in Florida markets. Though the lower mileage is expected to bump prices up, the Model nonetheless is projecting the lowest average prices for Florida. While the demographic variables are absent in the Model, demography could be playing an invisible role here.

When the Model identifies over-priced listings, it is pointing to a silver lining, meaning “potential” savings. If the model values were reported alongside the dealer prices, buyers would immediately know the extent of potential savings. SL #1, 2, 3 and 11 are comparable autos, yet the dealer prices range from $7K to $10K, a spread of $3K. Granted, #1 carries a dealer warranty, and warranty services are more expensive for older models due to the significantly higher exposure, but that shouldn’t make a difference of $3K. The Model, however, predicts a more reasonable range of $6K to $7.7K, thereby alerting dealers to the over-pricing and arming consumers with more negotiating power.

Alternatively, when the modeling process identifies under-priced listings, it is pointing to some “upfront” savings for consumers. This is an area where dealers would benefit most if they were to subscribe to the model estimates. Of course, the dealer prices could be deliberately lower, as minor negatives are not captured in the modeling database; even so, asking less than the model estimate would then be a conscious decision on the dealer’s part. While SL #1 through 7 are comparable (all high mileage) vehicles, the dealer price range of $3.5K to $5K is significantly lower than the model estimates of $5.5K to $6.7K. Then again, in the pre-owned market, dealers often factor in psychological breakpoints when pricing their fleet. For instance, the dealer might have factored in one such breakpoint (200K miles) while pricing SL #1.
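
One way to turn the over-priced and under-priced screens into a repeatable check is to compare each dealer price with its model estimate and flag gaps beyond a tolerance. The sketch below is only illustrative: the `Listing` class, the 10% tolerance and the three example listings are assumptions, not values taken from the post's tables.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    sl: int              # serial number of the listing
    dealer_price: float  # asking price in dollars
    model_est: float     # model estimate in dollars


def classify(listing: Listing, tolerance: float = 0.10) -> str:
    """Label a listing by its relative gap to the model estimate.

    The 10% default tolerance is an arbitrary illustration, not a
    threshold proposed in the post.
    """
    gap = (listing.dealer_price - listing.model_est) / listing.model_est
    if gap > tolerance:
        return "over-priced"
    if gap < -tolerance:
        return "under-priced"
    return "in line"


# Hypothetical listings for illustration only.
listings = [
    Listing(sl=1, dealer_price=10000, model_est=7700),
    Listing(sl=2, dealer_price=7000, model_est=6800),
    Listing(sl=3, dealer_price=3500, model_est=5500),
]

labels = {item.sl: classify(item) for item in listings}
for item in listings:
    print(f"SL #{item.sl}: dealer {item.dealer_price:,.0f}  "
          f"model {item.model_est:,.0f}  -> {labels[item.sl]}")
```

A relative tolerance keeps the screen comparable across price levels; a dealer or buyer could just as well use a fixed dollar threshold.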

To recap, in order to develop a statistically significant pricing model for the pre-owned market nationally, a spatially distributed modeling sample with a set of predictive variables is critical. Also, time-tested and stable research tools and techniques are equally important.  
