Zheng Wen - Publications

Journal Papers

[J12] J. Zhou, B. Hao, Z. Wen, J. Zhang, W. W. Sun, "Stochastic Low-rank Tensor Bandits for Multi-dimensional Online Decision Making", Journal of the American Statistical Association (JASA), published online. [arXiv]

[J11] Y. Chen, Z. Wen, and Y. Xie, "Dynamic Pricing in an Evolving and Unknown Marketplace", accepted by Management Science. [SSRN]

[J10] Z. Yu, J. Zhang, Z. Wen, A. Tacchetti, M. Wang, I. Gemp, "Teamwork Reinforcement Learning with Concave Utilities", IEEE Transactions on Mobile Computing, Vol. 23, Issue 5, pp 5709-5721, May 2024.

[J9] B. Hao, R. Jain, D. Tang, and Z. Wen, "Bridging Imitation and Online Reinforcement Learning: An Optimistic Tale", Transactions on Machine Learning Research, October 2023. [arXiv]

[J8] V. Dwaracherla, Z. Wen, I. Osband, X. Lu, S. M. Asghari, and B. Van Roy, "Ensembles for Uncertainty Estimation: Benefits of Prior Functions and Bootstrapping", Transactions on Machine Learning Research, May 2023. [arXiv]

[J7] X. Lu, B. Van Roy, V. Dwaracherla, M. Ibrahimi, I. Osband, and Z. Wen, "Reinforcement Learning, Bit by Bit", Foundations and Trends® in Machine Learning, Vol. 16, No. 6, pp 733-865. [arXiv]

[J6] I. Osband, B. Van Roy, D. Russo, and Z. Wen, "Deep Exploration via Randomized Value Functions", Journal of Machine Learning Research, Vol. 20, No. 124, pp 1-62, 2019.
- Conference Paper [C13] is a preliminary version of part of this paper

[J5] D. Russo, B. Van Roy, A. Kazerouni, I. Osband, and Z. Wen, "A Tutorial on Thompson Sampling", Foundations and Trends® in Machine Learning, Vol. 11, No. 1, pp 1-96, 2018. [arXiv]

[J4] Z. Wen and B. Van Roy, "Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization", Mathematics of Operations Research, Vol. 42, Issue 3, August 2017, pp. 762–782. [arXiv]
- Some theoretical results of this paper build on and add to those reported in Conference Paper [C5]

[J3] Z. Wen, D. O'Neill, and H. Maei, "Optimal Demand Response Using Device Based Reinforcement Learning", IEEE Transactions on Smart Grid, Vol. 6, No. 5, September 2015. [arXiv]

[J2] Z. Wen, S. Roy, and A. Saberi, "On the Dynamic Response of a Saturating Static Feedback-Controlled Single Integrator Driven by White Noise", IEEE Transactions on Automatic Control, Vol. 55, Issue 4, April 2010.

[J1] Z. Wen, S. Roy, and A. Saberi, "On the Disturbance Response and External Stability of a Saturating Static-Feedback-Controlled Double Integrator", Automatica, Vol. 44, Issue 8, August 2008.
- Conference Paper [C1] is a preliminary version of this paper

Book Chapters

[BC2] Z. Wen, "Reinforcement Learning", Chapter 2 of The Elements of Joint Learning and Optimization in Operations Management, edited by Xi Chen, Stefanus Jasin, Cong Shi, Springer. [Springer Link]

[BC1] Z. Wen, L. Durlofsky, B. Van Roy, and K. Aziz, "Approximate Dynamic Programming for Optimizing Oil Production", Chapter 25 of Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, edited by F. Lewis and D. Liu, John Wiley & Sons, Inc., December 2012. [PDF]
- Conference Paper [C2] is a preliminary version of part of this book chapter

Papers In Progress

W. Xu, S. Dong, X. Lu, G. Lam, Z. Wen, and B. Van Roy, "RLHF and IIA: Perverse Incentives", submitted. [arXiv]

D. Tang, R. Jain, B. Hao, and Z. Wen, "Efficient Online Learning with Offline Datasets for Infinite Horizon MDPs: A Bayesian Approach", to be submitted. [arXiv]

Z. Wen, I. Osband, C. Qin, X. Lu, M. Ibrahimi, V. Dwaracherla, S. M. Asghari, and B. Van Roy, "From Predictions to Decisions: The Importance of Joint Predictive Distributions", to be submitted. [arXiv]

"The Target Return Strategy", with Y. Xue and X. Jiang, submitted to The Financial Review, minor revision.

P. Grigas, A. Lobos, Z. Wen, and K. Lee, "Optimal Bidding, Allocation, and Budget Spending for a Demand-Side Platform with Generic Auctions", submitted to Production and Operations Management, first-round major revision. [SSRN]

Full Conference Papers

[C48] I. Osband, Z. Wen, S. M. Asghari, V. Dwaracherla, M. Ibrahimi, X. Lu, and B. Van Roy, "Epistemic Neural Networks", accepted by the 37th Conference on Neural Information Processing Systems (NeurIPS 2023) as a Spotlight, New Orleans, LA. [arXiv]

[C47] I. Osband, Z. Wen, S. M. Asghari, V. Dwaracherla, M. Ibrahimi, X. Lu, and B. Van Roy, "Approximate Thompson Sampling via Epistemic Neural Networks", 39th Conference on Uncertainty in Artificial Intelligence (UAI 2023), Pittsburgh, PA. [arXiv]

[C46] B. Hao, R. Jain, T. Lattimore, B. Van Roy, and Z. Wen, "Leveraging Demonstrations to Improve Online Learning: Quality Matters", the 40th International Conference on Machine Learning (ICML 2023), Honolulu, Hawaii. [arXiv][OpenReview]

[C45] I. Osband, Z. Wen, S. M. Asghari, V. Dwaracherla, B. Hao, M. Ibrahimi, D. Lawson, X. Lu, B. O'Donoghue, and B. Van Roy, "The Neural Testbed: Evaluating Predictive Distributions", the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), New Orleans, LA. [arXiv][OpenReview]

[C44] C. Qin, Z. Wen, X. Lu and B. Van Roy, "An Analysis of Ensemble Sampling", the 36th Conference on Neural Information Processing Systems (NeurIPS 2022), New Orleans, LA. [arXiv][OpenReview]

[C43] I. Osband, Z. Wen, S. M. Asghari, V. Dwaracherla, X. Lu, and B. Van Roy, "Evaluating High-Order Predictive Distributions in Deep Learning", the 38th Conference on Uncertainty in Artificial Intelligence (UAI 2022), Eindhoven, the Netherlands. [arXiv][OpenReview]

[C42] P. Xu, Z. Wen, H. Zhao, and Q. Gu, "Neural Contextual Bandits with Deep Representation and Shallow Exploration", the Tenth International Conference on Learning Representations (ICLR 2022), virtual conference.

[C41] A. Lobos, P. Grigas, and Z. Wen, "Joint Online Learning and Decision-making via Dual Mirror Descent", the Thirty-eighth International Conference on Machine Learning (ICML 2021), online conference.

[C40] Z. Wen, D. Precup, M. Ibrahimi, A. Barreto, B. Van Roy, and S. Singh, "On Efficiency in Hierarchical Reinforcement Learning", accepted by the Thirty-fourth Conference on Neural Information Processing Systems (NeurIPS 2020) as a Spotlight, online conference. [NeurIPS Videos]

[C39] J. Shi, N. Xu, T. Bui, F. Dernoncourt, Z. Wen, C. Xu, "A Benchmark and Baseline for Language-Driven Image Editing", Asian Conference on Computer Vision 2020 (ACCV 2020), online conference.

[C38] T. Yu, B. Kveton, Z. Wen, R. Zhang, and O. J. Mengshoel, "Influence Diagram Bandits", Thirty-seventh International Conference on Machine Learning (ICML 2020), online conference, originally planned to be held at Vienna, Austria.

[C37] Y. Park, R. Rossi, Z. Wen, G. Wu, and H. Zhao, "Structured Policy Iteration for Linear Quadratic Regulator", Thirty-seventh International Conference on Machine Learning (ICML 2020), online conference, originally planned to be held at Vienna, Austria.

[C36] P. Perrault, J. Healey, Z. Wen, and M. Valko, "Budgeted Online Influence Maximization", Thirty-seventh International Conference on Machine Learning (ICML 2020), online conference, originally planned to be held at Vienna, Austria.

[C35] R. Zhang, C. Chen, Z. Gan, W. Wang, D. Shen, G. Wang, Z. Wen, and L. Carin, "Improving Adversarial Text Generation by Modeling the Distant Future", the 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020), online conference, originally planned to be held at Seattle, Washington.

[C34] R. Zhang, C. Chen, Z. Gan, Z. Wen, W. Wang, and L. Carin, "Nested-Wasserstein Self-Imitation Learning for Sequence Generation", the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS 2020), online conference, originally planned to be held at Palermo, Sicily, Italy. [arXiv]
- Workshop Paper [W9] is a preliminary version of this paper

[C33] V. Dwaracherla, X. Lu, M. Ibrahimi, I. Osband, Z. Wen, and B. Van Roy, "Hypermodels for Exploration", the Eighth International Conference on Learning Representations (ICLR 2020), online conference, originally planned to be held at Addis Ababa, Ethiopia.

[C32] S. Li, W. Chen, Z. Wen, and K. Leung, "Stochastic Online Learning with Probabilistic Graph Feedback", the Thirty-Fourth AAAI Conference on Artificial Intelligence (AAAI 2020), New York, New York. [arXiv]

[C31] B. Hao, Y. Abbasi-Yadkori, Z. Wen, and G. Cheng, "Bootstrapping Upper Confidence Bound", the Thirty-third Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada. [arXiv]

[C30] Y. Xue, Z. Wen, and X. Jiang, "The Optimal Reservation Price", accepted for presentation at
- the 2020 Southern Finance Association (SFA) Annual Meeting, Online Conference.
- the 59th Annual Southwestern Finance Association (SWFA) Conference, San Antonio, Texas, 2020.
- the 2019 Southern Finance Association (SFA) Annual Meeting, Orlando, Florida, with title "The Optimal Price Trigger".

[C29] H. Singh, G. Hiranandani, P. Gupta, I. Ahamath Burhanuddin, Z. Wen, and B. Kveton, "Cascade Linear Submodular Bandits: Accounting for Position Bias and Diversity in Online Learning to Rank", the Conference on Uncertainty in Artificial Intelligence (UAI) 2019, Tel Aviv, Israel.

[C28] B. Kveton, C. Szepesvari, S. Vaswani, Z. Wen, M. Ghavamzadeh, and T. Lattimore, "Garbage In, Reward Out: Bootstrapping Exploration in Multi-Armed Bandits", the Thirty-sixth International Conference on Machine Learning (ICML 2019), Long Beach, California. [arXiv]

[C27] R. Zhang, Z. Wen, C. Chen, C. Fang, T. Yu, and L. Carin, "Scalable Thompson Sampling via Optimal Transport", Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), Naha, Okinawa, Japan.
- Workshop Paper [W7] is a preliminary version of this paper

[C26] S. Katariya, B. Kveton, Z. Wen, and V. K. Potluru, "Conservative Exploration using Interleaving", Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), Naha, Okinawa, Japan. [arXiv]

[C25] Y. Cao, Z. Wen, B. Kveton, and Y. Xie, "Nearly Optimal Adaptive Procedure for Piecewise-Stationary Bandit: a Change-Point Detection Approach", Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS 2019), Naha, Okinawa, Japan. [arXiv]

[C24] G. Theocharous, Z. Wen, Y. Abbasi-Yadkori, and N. Vlassis, "Scalar Posterior Sampling with Applications", Advances in Neural Information Processing Systems 31 (NeurIPS 2018) Proceedings, Montreal, Canada.

[C23] T. Yu, B. Kveton, Z. Wen, H. Bui, and O. J. Mengshoel, "SpectralFPL: Online Spectral Learning for Single Topic Models", The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 2018, Dublin, Ireland. [arXiv]

[C22] S. Li, Y. Abbasi-Yadkori, B. Kveton, S. Muthukrishnan, V. Vinay, and Z. Wen, "Offline Evaluation of Ranking Policies with Click Models", 24th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2018) Proceedings, London, United Kingdom. [arXiv]
- Workshop Paper [W5] is a preliminary version of this paper

[C21] Z. Wen, B. Kveton, M. Valko, and S. Vaswani, "Online Influence Maximization under Independent Cascade Model with Semi-Bandit Feedback", Advances in Neural Information Processing Systems 30 (NIPS 2017) Proceedings, Long Beach, California. [Poster][Adobe Research News]

[C20] M. Zoghi, T. Tunys, M. Ghavamzadeh, B. Kveton, C. Szepesvari, and Z. Wen, "Online Learning to Rank in Stochastic Click Models", International Conference on Machine Learning (ICML), Sydney, Australia, 2017. [arXiv]

[C19] S. Vaswani, B. Kveton, Z. Wen, M. Ghavamzadeh, L. Lakshmanan, and M. Schmidt, "Diffusion Independent Semi-Bandit Influence Maximization", International Conference on Machine Learning (ICML), Sydney, Australia, 2017. [arXiv]

[C18] S. Katariya, B. Kveton, C. Szepesvari, C. Vernade, and Z. Wen, "Bernoulli Rank-1 Bandits for Click Feedback", 26th International Joint Conference on Artificial Intelligence (IJCAI), Melbourne, Australia, 2017. [arXiv]

[C17] S. Zong, B. Kveton, S. Berkovsky, A. Ashkan, and Z. Wen, "Get to the Bottom: Causal Analysis for User Modeling", International Conference on User Modeling, Adaptation and Personalization (UMAP), Bratislava, Slovakia, 2017. (Best Paper Candidate)
- Short Conference Paper [SC2] is a preliminary version of part of this paper

[C16] S. Katariya, B. Kveton, C. Szepesvari, C. Vernade, and Z. Wen, "Stochastic Rank-1 Bandits", the 20th International Conference on Artificial Intelligence and Statistics (AISTATS), Fort Lauderdale, Florida, 2017. [arXiv]

[C15] S. Zong, H. Ni, K. Sung, N. Ke, Z. Wen, and B. Kveton, "Cascading Bandits for Large-Scale Recommendation Problems", Conference on Uncertainty in Artificial Intelligence (UAI), New York City, New York, 2016. [arXiv] (acceptance rate 31%)

[C14] S. Katariya, B. Kveton, C. Szepesvari, and Z. Wen, "DCM Bandits: Learning to Rank with Multiple Clicks", International Conference on Machine Learning (ICML), New York City, New York, 2016. [arXiv] (acceptance rate 24%)

[C13] I. Osband, B. Van Roy and Z. Wen, "Generalization and Exploration via Randomized Value Functions", International Conference on Machine Learning (ICML), New York City, New York, 2016. [arXiv] (acceptance rate 24%)

[C12] B. Kveton, Z. Wen, A. Ashkan, and C. Szepesvari, "Combinatorial Cascading Bandits", Neural Information Processing Systems (NIPS) Proceedings, Montreal, Canada, 2015. [arXiv] (acceptance rate 22%)

[C11] B. Kveton, C. Szepesvari, Z. Wen, and A. Ashkan, "Cascading Bandits: Learning to Rank in the Cascade Model", International Conference on Machine Learning (ICML), Lille, France, 2015. [PDF][Appendix] (acceptance rate 26%)

[C10] Z. Wen, B. Kveton, and A. Ashkan, "Efficient Learning in Large-Scale Combinatorial Semi-Bandits", International Conference on Machine Learning (ICML), Lille, France, 2015. [arXiv] (acceptance rate 26%)

[C9] A. Ashkan, B. Kveton, S. Berkovsky, and Z. Wen, "Optimal Greedy Diversity for Recommendation", International Joint Conference on Artificial Intelligence (IJCAI), Buenos Aires, Argentina, 2015. [arXiv][ACM DL] (acceptance rate 28.8%)
- Short Conference Paper [SC1] is a preliminary version of part of this paper

[C8] B. Kveton, Z. Wen, A. Ashkan, and C. Szepesvari, "Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits", the 18th International Conference on Artificial Intelligence and Statistics (AISTATS), San Diego, California, 2015. (acceptance rate 26.7%)

[C7] V. Gabillon, B. Kveton, Z. Wen, B. Eriksson, and S. Muthukrishnan, "Large-Scale Optimistic Adaptive Submodularity", AAAI Conference on Artificial Intelligence, Quebec City, Canada, 2014. (acceptance rate 28%)

[C6] B. Kveton, Z. Wen, A. Ashkan, H. Eydgahi, and B. Eriksson, "Matroid Bandits: Fast Combinatorial Optimization with Learning", Conference on Uncertainty in Artificial Intelligence (UAI), Quebec City, Canada, 2014. (plenary presentation, acceptance rate 8.2%)
- Workshop Paper [W2] is a preliminary version of part of this paper

[C5] Z. Wen and B. Van Roy, "Efficient Exploration and Value Function Generalization in Deterministic Systems", Neural Information Processing Systems (NIPS) Proceedings, Lake Tahoe, Nevada, 2013. (acceptance rate 25%)

[C4] V. Gabillon, B. Kveton, Z. Wen, B. Eriksson, and S. Muthukrishnan, "Adaptive Submodular Maximization in Bandit Setting", Neural Information Processing Systems (NIPS) Proceedings, Lake Tahoe, Nevada, 2013. (acceptance rate 25%)

[C3] Z. Wen, B. Kveton, B. Eriksson, and S. Bhamidipati, "Sequential Bayesian Search", International Conference on Machine Learning (ICML), Atlanta, Georgia, 2013. (acceptance rate 24%)
- Workshop Paper [W1] is a preliminary version of part of this paper

[C2] Z. Wen, L. Durlofsky, B. Van Roy, and K. Aziz, "Use of Approximate Dynamic Programming for Production Optimization", Society of Petroleum Engineers (SPE) Reservoir Simulation Symposium, The Woodlands, Texas, 2011. [PDF]

[C1] Z. Wen, S. Roy, and A. Saberi, "On the Disturbance Response and External Stability of a Saturating Static-Feedback-Controlled Double Integrator", American Control Conference (ACC), New York City, New York, 2007.

Short Conference Papers

[SC4] X. Lu, Z. Wen, B. Kveton, "Efficient Online Recommendation via Low-Rank Ensemble Sampling", short paper in RecSys 2018 Proceedings, Vancouver, Canada.

[SC3] G. Theocharous, N. Vlassis, and Z. Wen, "An Interactive Points of Interest Guidance System", poster at the 22nd Annual Meeting of the Intelligent User Interfaces Community (ACM IUI), Limassol, Cyprus, 2017.

[SC2] S. Zong, B. Kveton, S. Berkovsky, A. Ashkan, N. Vlassis, and Z. Wen, "Does Weather Matter? Causal Analysis of TV Logs", poster at World Wide Web Conference (WWW), Perth, Western Australia, 2017. [arXiv]

[SC1] A. Ashkan, B. Kveton, S. Berkovsky, and Z. Wen, "Diversified Utility Maximization for Recommendations", poster at ACM Conference on Recommender Systems (RecSys), Silicon Valley, California, 2014.

Workshop Papers

[W10] Z. Yu, J. Zhang, Z. Wen, A. Tacchetti, M. Wang, I. Gemp, "Teamwork Reinforcement Learning with Concave Utilities", Gamification and Multiagent Solutions Workshop, ICLR 2022.

[W9] R. Zhang, C. Chen, Z. Gan, Z. Wen, W. Wang, and L. Carin, "Nested-Wasserstein Self-Imitation Learning for Sequence Generation", Deep Reinforcement Learning Workshop, NeurIPS 2019, Vancouver, Canada.

[W8] R. Zhang, C. Chen, Z. Gan, W. Wang, Z. Wen, and L. Carin, "Sequence Generation with a Guider Network", ICML'19 Workshop on Real-World Sequential Decision Making, Long Beach, CA.

[W7] R. Zhang, Z. Wen, C. Chen, and L. Carin, "Scalable Thompson Sampling via Optimal Transport", Infer to Control Workshop on Probabilistic Reinforcement Learning and Structured Control at NIPS 2018, Montréal, Canada.

[W6] A. Lobos, P. Grigas, Z. Wen, and K. Lee, "Optimal Bidding, Allocation and Budget Spending for a Demand Side Platform Under Many Auction Types", AdKDDTargetAd2018 Workshop at KDD 2018, London, United Kingdom. [arXiv] (Best Student Paper)

[W5] S. Li, Y. Abbasi-Yadkori, B. Kveton, S. Muthukrishnan, V. Vinay, and Z. Wen, "Offline Evaluation of Ranking Policies with Click Models", CausalML 2018, Stockholm, Sweden.

[W4] C. Vernade, B. Kveton, Y. Abbasi-Yadkori, M. Ghavamzadeh, and Z. Wen, "Rank-1 A/B Testing", Women in Machine Learning Workshop 2017.

[W3] P. Grigas, A. Lobos, Z. Wen, and K. Lee, "Profit Maximization for Online Advertising Demand-Side Platforms", AdKDDTargetAd2017 Workshop at KDD 2017, Halifax, Canada. [arXiv]

[W2] B. Kveton, Z. Wen, A. Ashkan, and H. Eydgahi, "Matroid Bandits: Practical Large-Scale Combinatorial Bandits", in Proceedings of AAAI Workshop on Sequential Decision-Making with Big Data, Quebec City, Canada, 2014.

[W1] Z. Wen, B. Kveton, and S. Bhamidipati, "Learning to Discover: A Bayesian Approach", NIPS Workshop on Bayesian Optimization and Decision Making, Lake Tahoe, Nevada, 2012.

arXiv Preprints & Other Publications

X. Lu, I. Osband, S. M. Asghari, S. Gowal, V. Dwaracherla, Z. Wen, B. Van Roy, "Robustness of Epinets against Distributional Shifts", arXiv preprint. [arXiv]

W. Mou, Z. Wen, and X. Chen, "On the Sample Complexity of Reinforcement Learning with Policy Space Generalization", arXiv preprint. [arXiv]

B. Kveton, S. Mahdian, S. Muthukrishnan, Z. Wen, and Y. Xian, "Waterfall Bandits: Learning to Sell Ads Online", arXiv preprint. [arXiv]

S. Vaswani, B. Kveton, Z. Wen, A. Rao, M. Schmidt, and Y. Abbasi-Yadkori, "New Insights into Bootstrapping for Bandits", arXiv preprint. [arXiv]

B. Kveton, C. Szepesvari, A. Rao, Z. Wen, Y. Abbasi-Yadkori, and S. Muthukrishnan, "Stochastic Low-Rank Bandits", arXiv preprint. [arXiv]

Z. Wen, E. Bax, and J. Li, "Revenue-Maximizing Mechanism Design for Quasi-Proportional Auctions", arXiv preprint. [arXiv]

B. Kveton, Z. Wen, A. Ashkan, and M. Valko, "Learning to Act Greedily: Polymatroid Semi-Bandits", arXiv preprint. [arXiv]

Z. Wen, "Recommendation System Based on Collaborative Filtering", Stanford CS229 Course Project Report, 2008. [PDF]

Code

Epistemic Neural Networks. This is a general interface for uncertainty modeling in deep learning. All existing approaches to uncertainty modeling, such as Bayesian neural networks (BNNs), can be expressed as ENNs. However, there are ENN architectures that cannot be expressed as BNNs. This library provides interfaces and tools for designing and training ENNs.

The Neural Testbed. This is a system for evaluating the performance of epistemic neural networks, which are models that generate joint predictions over multiple outputs in response to multiple inputs. We have also included a set of baseline agents.
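
As a rough illustration of the two descriptions above, the sketch below treats a small ensemble as an ENN-style predictor: a function of an input x and an epistemic index z, where z picks one ensemble member. Reusing one sampled z across several inputs gives a joint prediction over those inputs. All names here (`MEMBERS`, `ensemble_enn`, `sample_index`, `sample_joint`) and the toy linear members are hypothetical; this is not the API of the Epistemic Neural Networks library or the Neural Testbed.

```python
import random
from typing import List, Sequence

# Hypothetical toy ensemble: each member is a 1-D linear model y = w * x + b.
MEMBERS = [(1.0, 0.0), (0.5, 1.0), (2.0, -1.0)]


def ensemble_enn(x: float, z: int) -> float:
    """ENN-style forward pass: the index z selects one ensemble member."""
    w, b = MEMBERS[z]
    return w * x + b


def sample_index(rng: random.Random) -> int:
    """Assumed reference distribution: uniform over ensemble members."""
    return rng.randrange(len(MEMBERS))


def sample_joint(xs: Sequence[float], rng: random.Random) -> List[float]:
    """Draw one joint prediction by sharing a single index across all inputs."""
    z = sample_index(rng)
    return [ensemble_enn(x, z) for x in xs]


if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(3):
        print(sample_joint([0.0, 1.0, 2.0], rng))
```

Because each sampled list comes from a single member, its entries are mutually consistent, which is what separates a joint prediction from sampling each input independently.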

Patents & Filed Patents

[P10] Efficient Exploration of Offline Models to Warm Start Online Bandit Learning, with G. Theocharous, Y. Abbasi-Yadkori, and Q. Wu, US Patent App., filed, 2019.

[P9] Multi-Task Equidistant Embedding, with H. Zhao, S. Kim, S. Li and B. Kveton, US Patent App., filed, 2018.

[P8] Change Point Detection in a Multi-Armed Bandit Recommendation System, with Y. Cao and B. Kveton, US Patent App., filed, 2018.

[P7] Online Training of Segmentation Model via Interactions With Interactive Computing Environment, with T. Yu, B. Kveton, and H. Bui, US Patent App., filed, 2018.

[P6] Multivariate Digital Campaign Content Testing Utilizing Rank-1 Best-Arm Identification, with Y. Abbasi-Yadkori, M. Ghavamzadeh, B. Kveton, and C. Vernade, US Patent App., filed, 2018.

[P5] Training And Utilizing Item-level Importance Sampling Models for Offline Evaluation and Execution of Digital Content Selection Policies, with S. Li, Y. Abbasi-Yadkori, B. Kveton, and V. Vinay, US Patent App., filed, 2018.

[P4] Online Diverse Set Generation From Partial-Click Feedback, with B. Kveton, P. Gupta, I. A. Burhanuddin, H. Singh, and G. Hiranandani, US Patent App., filed, 2018.

[P3] Influence Maximization Determination in a Social Network System, with S. Vaswani, B. Kveton, and M. Ghavamzadeh, US Patent App., filed, 2017.

[P2] Content-Adaptive Digital Content Adjustment Method and System, with Haojian Jin and Yale Song, US Patent 9942581B2. This patent was filed when I was at Yahoo Labs.

[P1] WO2014088564A1, Bayesian Content Discovery, with B. Kveton and S. Bhamidipati. This patent was filed when I was at Technicolor Research.

Theses

Z. Wen, "Efficient Reinforcement Learning with Value Function Generalization", Ph.D. Dissertation, Stanford University. [PDF]

Z. Wen, "On the External Stability of Linear Systems with Actuator Saturation Constraints, and the Decentralized Control of Communicating-Agent Networks with Security Constraints", M.S. Thesis, Washington State University.