Updated: How much should you bid for Phase 3 of the FM Auction?

FM Companies: Here’s the updated cheat sheet for bidding at Phase 3 of the FM auctions.

This updated model has tighter ranges and is, in my opinion, more accurate. I used Facebook’s ad reach potential for each city as a predictor of its license fee to build a linear regression model; the predictions here come from that model. If you’re interested, here’s the code for that model.

This Facebook-fueled model beats the franchise-based model I built a week ago for two reasons. First, Facebook’s ad reach potential is a better proxy for a city’s advertising potential than franchise data. Advertising potential is directly related to how much companies will spend on advertising their products on radio, which, in turn, is directly related to the price at which an FM station sells.

Second, the Facebook ad reach potential made my predictions more granular. Two cities, no matter how similar they might be, rarely sell at the same price. Franchise counts couldn’t capture this price uniqueness; I only ended up with buckets of similarly priced cities. If a city had 1 Cafe Coffee Day and 1 Domino’s pizza store, my model predicted it to sell at exactly the same price as every other city with 1 Cafe Coffee Day and 1 Domino’s pizza store. That’s still a good approximation, but the Facebook-fueled model lets me go beyond it.

For a 60% chance of winning the auction for a city, bid the amount in the second column. For an 80% chance, bid the amount in the third column. For an almost-certain 95% chance, bid the amount in the fourth column.

A third of the cities up for auction in the last round of Phase 3 of the auctions went unsold. The cities highlighted in the table below are my predictions for which cities are likely to go unsold in this round.

Ask yourself how badly you want the frequency of a particular city. Refer to the table to find the value that matches your priority for that city. Bid that amount. (All figures in lakhs of rupees.)

City         60% chance   80% chance   95% chance   Reserve Price
Achalpur*           143          249          416             171
Agartala            279          383          546              16
Aizwal              308          411          573              12
Akola               213          318          483              30
Alappuzha*          351          454          615             702
Amravati*           259          363          527             351
Asansol             331          433          596             194
Barshi*             152          258          424             171
Belgaum*            323          426          588             702
Bellary*            213          318          483             702

* Highlighted: reserve price above the 60% bid, so likely to go unsold.

Click here to view and download the table.

If you’re very eager to buy the frequency for Achalpur, you’re best off bidding the amount in the 95% chance column. Bidding 416 lakh gives you an almost certain chance of clearing the round with the frequency in hand.

Suppose you’re not as confident about the frequency for Agartala; you’re not desperate to go out and get it. You decide to bid an amount that gives you a 60% chance of winning the auction for Agartala’s FM station. Bidding 279 lakh is your best bet.

A quick note about reserve prices: the higher the reserve price relative to the 60% chance bid, the lower the probability of the frequency selling. For Achalpur, the reserve price (171 lakh) is above the 60% bid (143 lakh). For Agartala, however, there’s a wide gap between the reserve price (16 lakh) and the 60% bid (279 lakh). This means the FM station for Agartala will sell like hot cakes, but Achalpur’s won’t. Cities highlighted in the table are likely to go unsold.
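This rule of thumb boils down to a one-line comparison. Here it is in R, using the Achalpur and Agartala figures from the table (in lakh):

```r
# Reserve price above the 60% bid => frequency likely to go unsold
# (figures for Achalpur and Agartala taken from the table above, in lakh)
reserve <- c(Achalpur = 171, Agartala = 16)
bid60   <- c(Achalpur = 143, Agartala = 279)
likely_unsold <- reserve > bid60
likely_unsold  # Achalpur: TRUE, Agartala: FALSE
```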

I hope this table was valuable to you. Good luck at the auctions!

Building the regression models behind my FM predictions

In my last post, I said that I’d explain the analysis behind its conclusions, and this post is that explanation. It’s a bit technical – it’s all about the models I built and the process of building them, so it might only appeal to the data geeks out there. It’s quite long, so I’ll jump right in. If you’re looking for the data and statistics behind my predictions post, I’ll be writing about that soon.

I collected the data on past successful bids here, reserve prices here, cities available for auction here, and store location data for Domino’s, Cafe Coffee Day and Hero MotoCorp from their websites. Right off the bat, that’s a lot of data – and a lot of data rarely comes clean.

Cleaning the data posed two main challenges. First was the problem of resolution – do I treat Delhi, Gurgaon, Noida, New Delhi, Faridabad and Ghaziabad as one city or not? Nearly every large city suffered from this resolution problem. Eventually, I decided to resolve it to the same level as the FM list – if a separate frequency were being sold for Gurgaon (it isn’t), I’d consider it separately from Delhi; otherwise, it’s part of Delhi. It seems intuitive in hindsight, but it took me some time to arrive at this.

The second challenge was straight-up bad data. There are four ways to spell Trichy – the others being Tiruchirapali, Tiruchurappalli and Tiruchy – and every franchise had its own way of spelling it. I tried eliminating the vowels in city names to create a common spelling key that my data could safely rest upon, but that fell apart when my program encountered cities like Cuddapah (Kadapa) and Calicut (Kozhikode). I had to resolve these anomalies one at a time; each city really has its own challenges. After days of cleaning data, I was finally ready to build my model.
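The vowel-stripping idea looks roughly like this (my reconstruction, not the original script – the function name is mine). It collapses variant spellings like Trichy and Tiruchy to one key, but fails for true renames like Cuddapah/Kadapa, which needed manual fixes:

```r
# Sketch of the vowel-stripping normalization (illustrative, not the original code)
normalize_city <- function(name) {
  key <- tolower(name)
  key <- gsub("[aeiou]", "", key)   # drop vowels
  key <- gsub("[^a-z]", "", key)    # drop spaces and punctuation
  key
}

normalize_city("Trichy")    # "trchy"
normalize_city("Tiruchy")   # "trchy" -- variant spellings now match
normalize_city("Cuddapah")  # "cddph" -- but "Kadapa" gives "kdp", no match
```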

Here’s the code for my model in case you’re interested.

fm.df <- read.csv("Analysis.csv")               # read file

# create training dataset (cities with known round-1 prices)
fmtrain.df <- subset(fm.df, !is.na(LicenseFee))

# create test dataset (cities to predict)
fmtest.df <- subset(fm.df, is.na(LicenseFee))

# build regression models
fm.1 <- lm(LicenseFee ~ Category, data = fmtrain.df)
fm.2 <- lm(LicenseFee ~ Category + Dominos, data = fmtrain.df)
fm.3 <- lm(LicenseFee ~ Category + CCD, data = fmtrain.df)
fm.4 <- lm(LicenseFee ~ Category + Hero, data = fmtrain.df)
fm.5 <- lm(LicenseFee ~ Category + CCD + Dominos + Hero, data = fmtrain.df)

# upper limits at each confidence level, shown here for fm.1;
# the same steps are repeated for fm.2 through fm.5
minmax <- as.data.frame(predict(fm.1, fmtest.df, interval = 'confidence', level = 0.60))
df60 <- cbind(fmtest.df$City, minmax)
df60 <- subset(df60, select = c("fmtest.df$City", "upr"))

minmax <- as.data.frame(predict(fm.1, fmtest.df, interval = 'confidence', level = 0.80))
df80 <- cbind(fmtest.df$City, minmax)
df80 <- subset(df80, select = c("fmtest.df$City", "upr"))

minmax <- as.data.frame(predict(fm.1, fmtest.df, interval = 'confidence', level = 0.95))
df95 <- cbind(fmtest.df$City, minmax)
df95 <- subset(df95, select = c("fmtest.df$City", "upr"))

bizcon.fm1 <- cbind(df60, df80, df95)    # upper limits for the three confidence levels
write.csv(bizcon.fm1, "bizcon.fm1.csv")  # save file

First, I split my data into two groups – a training set (the prices I had from the round 1 auctions) and a test set (the questions I had to answer using my model). R has a convenient function, lm, that builds a linear regression model. I had four predictors to work with – Domino’s, Cafe Coffee Day, Hero MotoCorp and the ‘Category’ of the city. Adding an extra variable always improved my model a little (I found this out by calculating the R² value for every model), but I couldn’t use all of them at once. Using every variable in the mix would mean that unless a city had data for all four variables, the model would return a blank. I wouldn’t have been able to predict for 159 of the 248 cities in my dataset.
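That trade-off is easy to see on toy data (made-up numbers below – the real script reads Analysis.csv): adding a predictor changes the fit, but any row missing that predictor gets an NA prediction, shrinking the set of cities you can actually price.

```r
# Toy illustration of the missing-predictor trade-off (not the real dataset)
set.seed(1)
toy <- data.frame(
  Category   = factor(rep(c("A", "B", "C"), times = 10)),
  Dominos    = c(rpois(24, 3), rep(NA, 6)),  # some cities have no store data
  LicenseFee = rnorm(30, mean = 300, sd = 50)
)

m1 <- lm(LicenseFee ~ Category, data = toy)            # like fm.1
m2 <- lm(LicenseFee ~ Category + Dominos, data = toy)  # like fm.2

summary(m1)$r.squared          # fit with Category alone
summary(m2)$r.squared          # fit with the extra predictor
sum(is.na(predict(m2, toy)))   # cities m2 cannot price (m1 prices them all)
```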

So I built 5 models (they’re in the code) and ran the program for all of them. The program’s job is to find the upper limit at each confidence level for every regression model. Here’s a good source to check out if you’re not entirely sure what a regression model is.

Then I averaged out what each model gave me. If a city had a ‘Category’, a Domino’s store and a Hero dealership, it would run through all the models that relied only on those variables (fm.1, fm.2 and fm.4). I then summed the upper limits those models returned and divided the sum by 5 (at every confidence level) to place it on a common scale with the rest of the cities.
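Concretely, the averaging step looks like this for one such city at one confidence level (the upper limits below are made-up numbers for illustration):

```r
# One city with a Category, a Domino's store and a Hero dealership:
# only fm.1, fm.2 and fm.4 apply, but the sum is still divided by 5,
# the total model count, to keep every city on the same scale.
upper_limits <- c(fm1 = 410, fm2 = 432, fm4 = 425)  # illustrative figures
avg_bid <- sum(upper_limits) / 5
avg_bid  # 253.4
```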

This gave me just what I needed: a price to bid for a city based on how desperately a company wants its frequency. That’s how I ended up with my figures for each confidence interval and my bidding guide – the FM auction cheat sheet.

Stay tuned for a super interesting model based on new data coming up this weekend.