We study a dynamic pricing problem with finite inventory and parametric uncertainty on the demand distribution. Products are sold during selling seasons of finite length, and inventory that is unsold at the end of a selling season perishes. The goal of the seller is to determine a pricing strategy that maximizes the expected revenue. Inference on the unknown parameters is made by maximum likelihood estimation. We propose a pricing strategy for this problem, and show that the regret, which is the expected revenue loss due to not using the optimal prices, is O(log^2(T)) after T selling seasons. Apart from a small modification, our pricing strategy is a certainty equivalent pricing strategy, which means that at each moment the chosen price is the one that is optimal with respect to the current parameter estimates. The good performance of our strategy stems from an endogenous-learning property: using a pricing policy that is optimal with respect to a certain parameter sufficiently close to the optimal one leads to almost sure convergence of the parameter estimates to the true, unknown parameter. We also show an instance in which the regret for all pricing policies grows as log(T). This shows that our upper bound on the growth rate of the regret is close to the best achievable growth rate.
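The certainty equivalent idea described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the demand model (one customer per period who buys with probability exp(-theta * price)), the finite price grid, the grid search used in place of a proper likelihood maximization, and all function names. It also omits the small modification mentioned in the abstract, so it shows plain certainty equivalent pricing only.

```python
import math


def purchase_prob(price, theta):
    # Hypothetical demand model: in each period one customer arrives and
    # buys a single unit with probability exp(-theta * price).
    return math.exp(-theta * price)


def optimal_policy(theta, inventory, periods, prices):
    # Backward induction over one selling season for a given parameter theta.
    # value[c] is the optimal expected revenue-to-go with c units left;
    # it starts at zero because unsold inventory perishes at season end.
    value = [0.0] * (inventory + 1)
    policy = []
    for _ in range(periods):
        new_value = [0.0] * (inventory + 1)
        best = [None] * (inventory + 1)  # no price is needed when c == 0
        for c in range(1, inventory + 1):
            def revenue_to_go(p):
                q = purchase_prob(p, theta)
                return q * (p + value[c - 1]) + (1.0 - q) * value[c]
            best[c] = max(prices, key=revenue_to_go)
            new_value[c] = revenue_to_go(best[c])
        value = new_value
        policy.append(best)
    policy.reverse()  # policy[t][c]: price after t elapsed periods, c units left
    return policy, value


def mle_theta(history, theta_grid):
    # Maximum likelihood estimate restricted to a candidate grid (an
    # assumption for simplicity). history holds (price, sold) pairs with
    # sold in {0, 1}; prices and candidates are assumed strictly positive.
    def loglik(theta):
        total = 0.0
        for price, sold in history:
            q = purchase_prob(price, theta)
            total += math.log(q) if sold else math.log(1.0 - q)
        return total
    return max(theta_grid, key=loglik)


def certainty_equivalent_policy(history, theta_grid, inventory, periods, prices):
    # Plug the current estimate into the planning problem as if it were true.
    theta_hat = mle_theta(history, theta_grid)
    policy, _ = optimal_policy(theta_hat, inventory, periods, prices)
    return theta_hat, policy
```

In this sketch the seller would call `certainty_equivalent_policy` with all sales data observed so far, then charge `policy[t][c]` when `t` periods of the season have elapsed and `c` units remain.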
|Publisher||University of Twente, Department of Applied Mathematics|