MIT Sloan Experts


How to produce cleaner data for robust pricing


These days, a lot of people are talking about using data to make decisions. However, an often-overlooked aspect of data is that it can’t always be trusted. Bad data leads to bad decisions. Garbage in, garbage out.

Negin Golrezaei, KDD Career Development Professor in Communications and Technology & Assistant Professor of Operations Management, MIT Sloan

A common reason for data being garbage is that it is generated by strategic players. This is common in a two-sided marketplace, where you have the buyers on one side and sellers on the other. Every action by the buyers – like deciding whether to purchase something now or wait until an upcoming sale – influences the data. Sellers take data generated from the buying side to optimize operational decisions like pricing. But what happens when the buyers are aware of this and change their behavior to influence the seller to lower prices? Now, the data is generated by buyers who are incentivized to manipulate pricing.

This frequently happens in online advertising markets, where sellers run large numbers of auctions for ad views. The buyers are advertisers who purchase millions of ad views in a given day, leading to frequent interactions with the sellers. The buyers know their bids are used to set future prices, so they can strategically lower their bids. When buyers game the system this way, the data appears to place a lower valuation on the ad views, leading to lower prices.

I looked at this problem with Adel Javanmard of the University of Southern California and Vahab Mirrokni of Google Research, and we found that there are ways to limit this price manipulation. The key is the pricing algorithm. Instead of using bids to set prices directly, we designed an algorithm that uses censored bids, in this case a binary signal indicating whether or not the buyer won the prior auction.

Why does this work? To manipulate future prices, buyers now need to change the binary signal, which means changing the outcome of an auction. That is costly for them, because it can mean losing the current auction. Thus, by using the binary signal, we make any manipulation costly for the buyer.
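To make the intuition concrete, here is a minimal toy sketch in Python. It is not the algorithm from our paper; the two update rules, the function names, and the numbers (`rate`, `step`, the prices and bids) are all hypothetical and chosen only for illustration. It contrasts a naive rule that moves the next price toward the raw bid with a censored rule that sees only whether the bid cleared the current price.

```python
def update_from_bid(price, bid, rate=0.5):
    """Naive rule (hypothetical): move the next price toward the raw bid."""
    return price + rate * (bid - price)


def update_from_outcome(price, bid, step=0.1):
    """Censored rule (hypothetical): the seller only observes win or lose."""
    won = bid >= price  # the binary signal; the bid amount itself is discarded
    return price + step if won else price - step


price = 1.0          # current price set by the seller
truthful_bid = 1.5   # the buyer's true valuation
shaded_bid = 1.05    # a lowered bid that still wins at the current price
losing_bid = 0.9     # a bid low enough to lose the current auction

# Under the naive rule, shading the bid pulls the next price down for free.
naive_truthful = update_from_bid(price, truthful_bid)
naive_shaded = update_from_bid(price, shaded_bid)

# Under the censored rule, a shaded bid that still wins changes nothing.
censored_truthful = update_from_outcome(price, truthful_bid)
censored_shaded = update_from_outcome(price, shaded_bid)

# The only way to lower the next price is to lose the current auction.
censored_losing = update_from_outcome(price, losing_bid)

print(naive_shaded < naive_truthful)         # shading pays off under the naive rule
print(censored_shaded == censored_truthful)  # shading has no effect when censored
print(censored_losing < price)               # lowering the price requires a loss
```

In this toy setting, the buyer can shade a bid at no cost under the naive rule, while under the censored rule the only lever is to give up the current ad view, which is exactly the cost that discourages manipulation.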

As manipulation decreases, sellers have access to cleaner data, which leads to better data-driven decision-making.

This is good news for sellers because they can design an algorithm to generate clean data. We showed this with advertising, but this model can work in other marketplaces too. Financial markets and stock markets are similar in that buyers’ behavior can influence price. The same approach should likewise make pricing more robust in those situations.

The bottom line is that while you shouldn’t fully trust data, you can use an algorithm to censor the data to improve its reliability. It’s not a foolproof solution, but it is a good place to start to clean up the data and make more effective data-driven decisions.

MIT Sloan School of Management Prof. Negin Golrezaei is coauthor of “Dynamic incentive-aware learning: Robust pricing in contextual auctions.” The paper has been accepted for publication by the Operations Research Journal.

For more info Negin (Nicki) Golrezaei (617) 715-2622