Wednesday, January 10, 2018

Apply Data Science to the Pre-owned Auto Market - A Mercedes-Benz C-300 Case Study

Mercedes-Benz C-300 -- Model Year 2012

Car enthusiasts around the world recognize Mercedes-Benz (“MB”) as one of the best-engineered brands of all time. Its durability is unsurpassed. Its design engineering is unrivaled. Its reliability is matchless. Its road traction is legendary. Its safety is trend-setting. Consequently, its loyalty beggars description.

Though the C-300 is MB’s baseline luxury model, it has the entire range of bells and whistles one expects in an upscale luxury version under $50K: a 228hp V6, 4MATIC AWD, power sunroof, leatherette, dual climate control, eight airbags, eight speakers, Bluetooth, etc., in addition to a 48-month/50,000-mile bumper-to-bumper warranty.


Modeling Step 1 (Correlation Matrix)



The above correlation matrix (1) is unrestricted, representing the entire database without any constraints whatsoever. As usual, the Dealer Price (Dealer Pr) has its strongest (negative) correlation with Miles, signifying that higher mileage tends to dampen the dealers’ asking prices in the market.

Warranty is the next best predictive variable, with extremely low collinearity with the other potential independent variables. The positive coefficient points to buyers’ preference for warranted vehicles over their counterparts with expired warranties. Unsurprisingly, Accident has a negative impact on dealer prices, while well-maintained cars (Service) are rewarded. Surprisingly, the highest-rated original ownership is unrewarded. The plausible explanation is that 3-year leases are so popular in the small world of upscale vehicles that the "first owner" on record is, in effect, the first purchaser after the lease expires.
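As a rough illustration of how a correlation matrix of this kind could be produced, the sketch below computes pairwise Pearson coefficients in plain Python. The column names and the six listings are invented purely for illustration; they are not the data behind the matrices discussed here, and the original analysis may well have used a different tool.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return {(a, b): r} for every ordered pair of named columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Hypothetical listings, invented for illustration only (not the article's data).
listings = {
    "dealer_price": [24500, 21900, 19800, 17500, 15200, 13900],
    "miles":        [22000, 35000, 48000, 61000, 83000, 97000],
    "warranty":     [1, 1, 1, 0, 0, 0],  # 1 = Certified or Dealer warranty in force
    "accident":     [0, 0, 1, 0, 1, 1],  # 1 = accident on record
}

matrix = correlation_matrix(listings)
for name in listings:
    r = matrix[("dealer_price", name)]
    print(f"dealer_price vs {name:12s}: {r:+.3f}")
```

The same function can be rerun on any filtered subset of the rows. Note that a column that is constant within a subset has zero variance, so its correlation is undefined and this sketch would raise a division error for it.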




The correlation matrix (2), by contrast, is constrained to vehicles with the balance of the factory re-certified (Certified) warranty or a special dealer warranty. Dealer warranties are generally sales promotion measures, hence quite short-lived, often ranging between 30 days/1,000 miles and 90 days/3,000 miles, although 1-year warranties with limited or unlimited miles are not uncommon. While Miles and Warranty are still showing their normal relationships with the Dealer Price, Accident and Owner have switched sides.

Given the limited period and the resulting capped mileage, the Accident variable has become irrelevant (uncorrelated). On the other hand, the Owner variable has turned into a big positive now, with an equally negative relationship with Miles, suggesting that original ownership would be rewarded as long as fewer (than normal) miles are driven. Meanwhile, Service now has the most positive correlation coefficient, emphasizing that maintaining the vehicle according to the manual pays off economically as well.




The above scatter graph depicts the usual negative relationship between the Dealer Prices and Miles. Prices generally decrease as mileage increases. Thus far, this graph has demonstrated the most classic relationship; nonetheless, the fit would be much tighter with the trimming of some outliers, thus paving the way for a much higher R-square, perhaps to a more customary level.
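To make the point about outliers concrete, here is a minimal sketch of how trimming can raise the R-square of a simple price-versus-miles fit. The seven data points and the $3,000 residual threshold are assumptions made up for this example, and the single-pass trim (fit once, drop large residuals, refit) is only one of several possible trimming rules.

```python
def fit_line(xs, ys):
    """Least-squares intercept and slope for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def r_squared(xs, ys):
    """R-square of the least-squares line through the points."""
    intercept, slope = fit_line(xs, ys)
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot

def trim_outliers(xs, ys, max_abs_residual):
    """Drop points whose residual from the initial fit exceeds the threshold."""
    intercept, slope = fit_line(xs, ys)
    kept = [(x, y) for x, y in zip(xs, ys)
            if abs(y - (intercept + slope * x)) <= max_abs_residual]
    return [x for x, _ in kept], [y for _, y in kept]

# Hypothetical listings; the last one is a deliberately over-priced outlier.
miles = [20000, 35000, 50000, 65000, 80000, 95000, 40000]
prices = [25100, 22900, 21050, 18950, 17100, 14900, 30000]

r2_all = r_squared(miles, prices)
trimmed_miles, trimmed_prices = trim_outliers(miles, prices, max_abs_residual=3000)
r2_trimmed = r_squared(trimmed_miles, trimmed_prices)

print(f"R-square with all points:  {r2_all:.4f}")
print(f"R-square after trimming:   {r2_trimmed:.4f}")
```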


Modeling Step 2 (Multiple Regression Analysis)




The model R-square – 0.9681076 – is reasonably high, with potential for even higher R-square if the model is rerun without the outliers.

The above MRA output confirms the negative contributory relationship between Miles and the dependent variable, meaning higher miles are negatively contributing to the predicted prices. Though the Miles coefficient is seemingly small, it will nonetheless have reasonable impact on cars with high mileage; for example, the predicted price of a 2012 C-300 with 100,000 miles will be reduced by $2,238 (-0.022380 * 100,000), as opposed to only about $448 (-0.022380 * 20,000) for a competing one with only 20,000 miles on it.
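The mileage arithmetic can be checked with a few lines of Python. Only the Miles coefficient is taken from the MRA output quoted above; the function name is an illustrative choice, and the other model terms are deliberately left out.

```python
MILES_COEFFICIENT = -0.022380  # dollars per mile, as quoted from the MRA output

def mileage_adjustment(miles):
    """Dollar contribution of mileage alone to the predicted price."""
    return MILES_COEFFICIENT * miles

for miles in (20_000, 100_000):
    print(f"{miles:>7,} miles -> {mileage_adjustment(miles):>8,.0f} dollars")
```

Because the term is linear, the adjustment scales directly with the odometer reading: five times the mileage means five times the price reduction.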

Accident is the most important independent variable (highest t stat and lowest P-value) in the model, followed by Service, Owner and Warranty. Again, the standout presence of Accident points to the fact that it provides the maximum price differentiation between the accident-free and accident-reported groups. Simply put, the future owners of these fairly late model upscale cars are most likely risk-averse, and as a result are unwilling to pay the high market price either for high mileage cars or for cars that have sustained major damage or accidents.

To interpret the other Model coefficients, the well-maintained cars provide the necessary peace of mind for the buyers as they are willing to accept good service (service maintenance by the manual, garage kept, etc.) in lieu of the expired warranty (the original 4-year factory warranty period for these cars expired in 2016). Additionally, the single ownership and Certified/Dealer warranty are greatly favored in retaining high value. Of course, higher miles and accidents have perceptibly negative impacts on predicted prices.


Modeling Step 3 (Analysis of Model Estimates)




The above percentile graph shows that while the model estimates are significantly lower than the dealer prices at the bottom end of the curve (up to the 25th percentile), the two move in tandem along the long stretch of the curve, though diverging at the outer end (> 90th percentile). The fact that the model has been predicting lower prices at the bottom end of the curve points to above-the-market asking prices for the lower end units, perhaps those with multiple negatives: accidents on record, multiple ownership, inadequate maintenance or high mileage.
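A percentile comparison of this sort can be sketched as follows. The two price lists are hypothetical values chosen only to mimic the shape described above (model below dealer at the bottom, close in the middle, diverging at the top), and the linear-interpolation percentile rule is an assumption, since the method behind the original graph is not stated.

```python
def percentile(values, pct):
    """Percentile (0-100) by linear interpolation between closest ranks."""
    ordered = sorted(values)
    rank = (pct / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])

# Hypothetical dealer prices and model estimates, for illustration only.
dealer_prices = [12900, 14500, 16200, 17800, 19500, 21000, 22400, 24900]
model_estimates = [10800, 13100, 16000, 17900, 19400, 21200, 22300, 26500]

for pct in (5, 25, 50, 75, 95):
    dealer = percentile(dealer_prices, pct)
    model = percentile(model_estimates, pct)
    gap = dealer - model
    print(f"{pct:>2}th percentile: dealer {dealer:>9,.0f}  "
          f"model {model:>9,.0f}  gap {gap:>+8,.0f}")
```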

On the other hand, this additionally suggests that the model estimates could help both consumers and dealers get on the same page quickly, as these estimates are independently derived. Likewise, private sellers can validate their subject prices before accepting trade-in values from the dealers, considering that these are upscale vehicles and local comps could be few and far between.




The above Warranty table shows that the dealers are over-charging by packaging vehicles with in-house (Dealer) warranties. Conversely, they are significantly under-pricing the factory Certified vehicles, by roughly $3,000 on average. Vehicles without any warranty (None) are the least preferred, thus fetching -$2,000, either way. Interestingly, the vehicles with dealer warranty have far fewer miles than those with expired warranty.




While the Model is confirming the dealer pricing for the fleet without any reported damages/accidents, the dealers are way over-pricing the ones with reported accidents. Therefore, by having the model estimates placed alongside the dealer prices, consumers could save on average $5,229 ($18,449 - $13,220), a substantial saving and strong protection against dealers’ over-pricing.
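The comparison tables in this step all follow the same pattern: group the listings by one attribute, average the dealer prices and the model estimates within each group, and take the difference. A minimal sketch of that pattern is shown below; the field names and the four listings are hypothetical, and the resulting spreads are not the article's figures.

```python
from collections import defaultdict

def average_spread_by_group(listings, key):
    """Mean dealer price, mean model estimate and their gap for each group."""
    groups = defaultdict(list)
    for row in listings:
        groups[row[key]].append(row)
    summary = {}
    for group, rows in groups.items():
        dealer = sum(r["dealer_price"] for r in rows) / len(rows)
        model = sum(r["model_estimate"] for r in rows) / len(rows)
        summary[group] = {"dealer": dealer, "model": model, "spread": dealer - model}
    return summary

# Hypothetical listings, invented for illustration only.
listings = [
    {"accident": "Yes", "dealer_price": 18000, "model_estimate": 13000},
    {"accident": "Yes", "dealer_price": 19000, "model_estimate": 14000},
    {"accident": "No", "dealer_price": 22000, "model_estimate": 21800},
    {"accident": "No", "dealer_price": 24000, "model_estimate": 24400},
]

summary = average_spread_by_group(listings, key="accident")
for group, stats in summary.items():
    print(f"Accident={group}: dealer {stats['dealer']:,.0f}, "
          f"model {stats['model']:,.0f}, spread {stats['spread']:+,.0f}")
```

A positive spread means the dealers are asking more than the model supports; the same function works for any other grouping field, such as warranty type or number of owners, simply by changing `key`.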




The Model also indicates that the one owner vehicles are expected to fetch premium prices, though the dealers are pricing the multiple ownership cars higher than the original owners’. Ironically, the 3-owner vehicles are priced the highest, roughly $2,650 above the market, while the 1-owner ones are the lowest, roughly $2,500 below the market. Again, this transposition of prices could be avoided if the dealers were to subscribe to some legitimate model values. Moreover, by having the model estimates available side by side with the dealer prices, dealers would protect average consumers and ease deal-making by eliminating all unnecessary price haggling back and forth.





Though the dealers are asking more or less the same price irrespective of the owner’s maintenance history, the Model is nevertheless urging a much higher price (in fact, over $3,000 higher) for the better maintained vehicles. Needless to say, absent model values alongside dealer prices, buyers would way overpay for the lesser maintained vehicles, which points to the need to reinvent this industry by requiring a set of independent values to be published alongside the dealer prices.




The above Miles table illustrates the price comparison with miles broken down into quartiles. While the Model is predicting very similar prices across all quartiles, dealers are asking significantly higher prices for the low mileage cars. Dealer prices in the high mileage quartile are somewhat lower than the model estimates. Unfortunately, the lure of the low mileage vehicles is forcing consumers to pay an unwarranted premium which could be avoided if the model estimates were also published alongside the dealer prices.
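For readers who want to reproduce a quartile breakdown like this, the sketch below assigns each listing to a mileage quartile by rank and then averages the two price series within each quartile. The eight listings are hypothetical, and rank-based bucketing (which splits ties arbitrarily) is an assumption rather than the documented method.

```python
def quartile_labels(values):
    """Assign each value to quartile 1-4 according to its rank in sorted order."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [0] * len(values)
    for rank, index in enumerate(order):
        labels[index] = rank * 4 // len(values) + 1
    return labels

# Hypothetical listings, invented for illustration only.
miles = [18000, 95000, 42000, 61000, 27000, 78000, 53000, 88000]
dealer_prices = [26500, 14200, 21800, 18900, 25400, 16700, 20300, 15100]
model_estimates = [22900, 15600, 21200, 19300, 22400, 17500, 20600, 16400]

labels = quartile_labels(miles)
for quartile in (1, 2, 3, 4):
    rows = [i for i, label in enumerate(labels) if label == quartile]
    dealer = sum(dealer_prices[i] for i in rows) / len(rows)
    model = sum(model_estimates[i] for i in rows) / len(rows)
    print(f"Q{quartile}: dealer {dealer:>9,.0f}  model {model:>9,.0f}  "
          f"spread {dealer - model:>+8,.0f}")
```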




The Model indicates that location arbitrage is virtually non-existent nationally, though the West Coast dealers are asking quite a bit higher prices than the national average. According to the Model, the West Coast prices are, on average, $900 above the real market. Of course, a normalized metric (Model Est/Miles) is an alternate way to standardize and evaluate this scenario.




The above data sample demonstrates that the dealers are way over-pricing the cars with reported accidents. The Model, on the other hand, is heavily discounting those cars. The spread is widest when stacked with lack of warranty, multiple owners and less than perfect maintenance (e.g., SL # 2, 5 and 8). If the model estimates were to be placed alongside the dealer prices, buyers would immediately know the extent of the over-pricing.




When the modeling process identifies the under-priced cars, it’s pointing to some “upfront” savings for the consumers. This is an area where the dealers would benefit most if they were to subscribe to some model estimates. The above sample shows that the dealers are generally under-pricing the accident-free vehicles, with the widest spreads attributed to those that are additionally backed by warranty, single ownership and excellent service history.

As indicated before, now and then, the dealer prices could consciously be lower to factor in some minor negatives (rarely captured in the modeling database) or to address some imminent psychological breakpoints. SL # 6 could be one such psychological case where the dealer might have consciously lowered the price (the lowest price) as the vehicle is on the verge of reaching 100K miles, a big psychological milestone.

Considering the popularity of the Mercedes-Benz C-300 series, the 2012 model has become one of the most sought-after mid-size luxury late models on the market today.
