Zheng Wen - Publications

Papers In Progress

  • V. Dwaracherla, Z. Wen, I. Osband, X. Lu, S. M. Asghari, and B. Van Roy, "Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping", submitted. [arXiv]

  • I. Osband, Z. Wen, S. M. Asghari, V. Dwaracherla, M. Ibrahimi, X. Lu, and B. Van Roy, "Epistemic Neural Networks", submitted. [arXiv]

  • Z. Wen, I. Osband, C. Qin, X. Lu, M. Ibrahimi, V. Dwaracherla, S. M. Asghari, and B. Van Roy, "From Predictions to Decisions: The Importance of Joint Predictive Distributions", submitted. [arXiv]

  • J. Zhou, B. Hao, Z. Wen, J. Zhang, and W. W. Sun, "Stochastic Low-rank Tensor Bandits for Multi-dimensional Online Decision Making", submitted to JASA, first-round major revision. [arXiv]

  • P. Grigas, A. Lobos, Z. Wen, and K. Lee, "Optimal Bidding, Allocation, and Budget Spending for a Demand-Side Platform with Generic Auctions", submitted to Operations Research, first-round major revision. [SSRN]

  • X. Lu, B. Van Roy, V. Dwaracherla, M. Ibrahimi, I. Osband, and Z. Wen, "Reinforcement Learning, Bit by Bit", submitted to Foundations and Trends® in Machine Learning. [arXiv]

Journal Papers

  • [J7] Y. Chen, Z. Wen, and Y. Xie, "Dynamic Pricing in an Evolving and Unknown Marketplace", accepted by Management Science. [SSRN]

Book Chapters

  • [BC2] Z. Wen, "Reinforcement Learning", Chapter 2 of The Elements of Joint Learning and Optimization in Operations Management, edited by Xi Chen, Stefanus Jasin, Cong Shi, Springer. [Springer Link]

Full Conference Papers

  • [C45] I. Osband, Z. Wen, S. M. Asghari, V. Dwaracherla, B. Hao, M. Ibrahimi, D. Lawson, X. Lu, B. O'Donoghue, and B. Van Roy, "The Neural Testbed: Evaluating Predictive Distributions", accepted by NeurIPS 2022, New Orleans, LA. [arXiv]

  • [C44] C. Qin, Z. Wen, X. Lu, and B. Van Roy, "An Analysis of Ensemble Sampling", accepted by NeurIPS 2022, New Orleans, LA. [arXiv]

  • [C43] I. Osband, Z. Wen, S. M. Asghari, V. Dwaracherla, X. Lu, and B. Van Roy, "Evaluating High-Order Predictive Distributions in Deep Learning", the 38th Conference on Uncertainty in Artificial Intelligence (UAI 2022), Eindhoven, the Netherlands. [arXiv][OpenReview]

  • [C38] T. Yu, B. Kveton, Z. Wen, R. Zhang, and O. J. Mengshoel, "Influence Diagram Bandits", the Thirty-seventh International Conference on Machine Learning (ICML 2020), online conference, originally planned to be held in Vienna, Austria.

  • [C36] P. Perrault, J. Healey, Z. Wen, and M. Valko, "Budgeted Online Influence Maximization", the Thirty-seventh International Conference on Machine Learning (ICML 2020), online conference, originally planned to be held in Vienna, Austria.

  • [C33] V. Dwaracherla, X. Lu, M. Ibrahimi, I. Osband, Z. Wen, and B. Van Roy, "Hypermodels for Exploration", the Eighth International Conference on Learning Representations (ICLR 2020), online conference, originally planned to be held in Addis Ababa, Ethiopia.

  • [C30] Y. Xue, Z. Wen, and X. Jiang, "The Optimal Reservation Price", accepted for presentation at
    • the 2020 Southern Finance Association (SFA) Annual Meeting, Online Conference.
    • the 59th Annual Southwestern Finance Association (SWFA) Conference, San Antonio, Texas, 2020.
    • the 2019 Southern Finance Association (SFA) Annual Meeting, Orlando, Florida, under the title "The Optimal Price Trigger".

  • [C16] S. Katariya, B. Kveton, C. Szepesvari, C. Vernade, and Z. Wen, "Stochastic Rank-1 Bandits", the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), Fort Lauderdale, Florida, 2017. [arXiv]

  • [C3] Z. Wen, B. Kveton, B. Eriksson, and S. Bhamidipati, "Sequential Bayesian Search", International Conference on Machine Learning (ICML), Atlanta, Georgia, 2013. (acceptance rate: 24%)

Short Conference Papers

Workshop Papers & Other Publications

  • [W10] R. Zhang, C. Chen, Z. Gan, Z. Wen, W. Wang, and L. Carin, "Nested-Wasserstein Self-Imitation Learning for Sequence Generation", Deep Reinforcement Learning Workshop, NeurIPS 2019, Vancouver, Canada.

  • [W9] R. Zhang, C. Chen, Z. Gan, W. Wang, Z. Wen, and L. Carin, "Sequence Generation with a Guider Network", ICML'19 Workshop on Real-World Sequential Decision Making, Long Beach, CA.

  • [W8] R. Zhang, Z. Wen, C. Chen, and L. Carin, "Scalable Thompson Sampling via Optimal Transport", Infer to Control Workshop on Probabilistic Reinforcement Learning and Structured Control at NIPS 2018, Montréal, Canada.

  • [W7] A. Lobos, P. Grigas, Z. Wen, and K. Lee, "Optimal Bidding, Allocation and Budget Spending for a Demand Side Platform Under Many Auction Types", AdKDDTargetAd2018 Workshop at KDD 2018, London, United Kingdom. [arXiv] (Best Student Paper)

  • [W6] S. Li, Y. Abbasi-Yadkori, B. Kveton, S. Muthukrishnan, V. Vinay, and Z. Wen, "Offline Evaluation of Ranking Policies with Click Models", CausalML 2018, Stockholm, Sweden.

  • [W5] C. Vernade, B. Kveton, Y. Abbasi-Yadkori, M. Ghavamzadeh, and Z. Wen, "Rank-1 A/B Testing", Women in Machine Learning Workshop 2017.

  • [W4] P. Grigas, A. Lobos, Z. Wen, and K. Lee, "Profit Maximization for Online Advertising Demand-Side Platforms", AdKDDTargetAd2017 Workshop at KDD 2017, Halifax, Canada. [arXiv]

Unpublished Preprints

  • B. Kveton, S. Mahdian, S. Muthukrishnan, Z. Wen, and Y. Xian, "Waterfall Bandits: Learning to Sell Ads Online", unpublished preprint. [arXiv]

  • S. Vaswani, B. Kveton, Z. Wen, A. Rao, M. Schmidt, and Y. Abbasi-Yadkori, "New Insights into Bootstrapping for Bandits", unpublished preprint. [arXiv]

  • B. Kveton, C. Szepesvari, A. Rao, Z. Wen, Y. Abbasi-Yadkori, and S. Muthukrishnan, "Stochastic Low-Rank Bandits", unpublished preprint. [arXiv]

  • B. Kveton, Z. Wen, A. Ashkan, and M. Valko, "Learning to Act Greedily: Polymatroid Semi-Bandits", unpublished preprint. [arXiv]

Code

  • The Neural Testbed. A system for evaluating epistemic neural networks, which are models that generate joint predictions over multiple outputs in response to multiple inputs. It also includes a set of baseline agents.

Patents & Filed Patents

  • [P10] Efficient Exploration of Offline Models to Warm Start Online Bandit Learning, with G. Theocharous, Y. Abbasi-Yadkori, and Q. Wu, US Patent App., filed, 2019.

  • [P9] Multi-Task Equidistant Embedding, with H. Zhao, S. Kim, S. Li, and B. Kveton, US Patent App., filed, 2018.

  • [P8] Change Point Detection in a Multi-Armed Bandit Recommendation System, with Y. Cao and B. Kveton, US Patent App., filed, 2018.

  • [P7] Online Training of Segmentation Model via Interactions With Interactive Computing Environment, with T. Yu, B. Kveton, and H. Bui, US Patent App., filed, 2018.

  • [P6] Multivariate Digital Campaign Content Testing Utilizing Rank-1 Best-Arm Identification, with Y. Abbasi-Yadkori, M. Ghavamzadeh, B. Kveton, and C. Vernade, US Patent App., filed, 2018.

  • [P5] Training And Utilizing Item-level Importance Sampling Models for Offline Evaluation and Execution of Digital Content Selection Policies, with S. Li, Y. Abbasi-Yadkori, B. Kveton, and V. Vinay, US Patent App., filed, 2018.

  • [P4] Online Diverse Set Generation From Partial-Click Feedback, with B. Kveton, P. Gupta, I. A. Burhanuddin, H. Singh, and G. Hiranandani, US Patent App., filed, 2018.

  • [P3] Influence Maximization Determination in a Social Network System, with S. Vaswani, B. Kveton, and M. Ghavamzadeh, US Patent App., filed, 2017.

Theses