Tianhu Deng (邓天虎): Data-driven Convex Policy Optimization in an Assemble-to-order System

Published: 2023-10-11

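Source: School of Management

Time: Friday, October 13, 2023, 14:30-15:30

Venue: Second Academic Lecture Hall, New Building, School of Management

Speaker: Dr. Tianhu Deng (邓天虎)

Affiliation: Tsinghua University

Host: School of Management

Abstract: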

This paper investigates the optimization of periodic-review assemble-to-order (ATO) production systems, in which multiple products are assembled from multiple components, in a data-driven setting where only historical demand data are available and the demand distributions are unknown. To address this challenge, we propose a semi-model-based fitted Q iteration (S-FQI) algorithm framework that leverages the known transition dynamics. We prove a statistical convergence rate for the proposed algorithm in terms of the number of iterations, the number of demand samples, and the number of generated trajectories.

Additionally, we introduce the convex-TD3 (CTD3) algorithm to tackle practical challenges by incorporating the convex property of ATO systems and utilizing an input convex neural network (ICNN) to improve efficiency and effectiveness.
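
For readers unfamiliar with input convex neural networks, the following is a minimal sketch of the general ICNN construction, not the CTD3 algorithm from the talk. The output is convex in the input because the weights applied to hidden activations are nonnegative and the activation (ReLU) is convex and nondecreasing. The class name `TinyICNN`, the helper `midpoint_gap`, the layer sizes, and the random initialization are all assumptions made for illustration.

```python
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(v):
    """Elementwise ReLU: convex and nondecreasing."""
    return [max(0.0, t) for t in v]


class TinyICNN:
    """Hypothetical one-hidden-layer network whose scalar output is convex in x."""

    def __init__(self, in_dim, hidden, seed=0):
        rng = random.Random(seed)
        # Weights acting directly on the input x may have any sign.
        self.Wx0 = [[rng.uniform(-1, 1) for _ in range(in_dim)] for _ in range(hidden)]
        self.b0 = [rng.uniform(-1, 1) for _ in range(hidden)]
        self.Wx1 = [[rng.uniform(-1, 1) for _ in range(in_dim)]]
        self.b1 = rng.uniform(-1, 1)
        # Weights acting on hidden activations must be nonnegative.
        self.Wz1 = [[rng.uniform(0, 1) for _ in range(hidden)]]

    def __call__(self, x):
        # Hidden layer: ReLU of an affine map, so each unit is convex in x.
        z1 = relu([a + b for a, b in zip(matvec(self.Wx0, x), self.b0)])
        # Output: nonnegative combination of convex units plus an affine term.
        return matvec(self.Wz1, z1)[0] + matvec(self.Wx1, x)[0] + self.b1


def midpoint_gap(f, x, y):
    """Return (f(x) + f(y)) / 2 - f((x + y) / 2); nonnegative when f is convex."""
    mid = [(a + b) / 2 for a, b in zip(x, y)]
    return 0.5 * (f(x) + f(y)) - f(mid)
```

Calling `midpoint_gap(TinyICNN(3, 8), x, y)` on any two input vectors `x` and `y` of length 3 should give a value that is nonnegative up to floating-point rounding, which is the midpoint form of convexity.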

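About the speaker:

Tianhu Deng (Ph.D., Associate Professor) is currently with the Department of Industrial Engineering at Tsinghua University. He received his Ph.D. in Industrial Engineering and Operations Research from the University of California, Berkeley in 2013 and his bachelor's degree from the Department of Industrial Engineering at Tsinghua University in 2008. His current research focuses on smart supply chains. He has published more than 20 papers as first author or corresponding author in international journals and conferences, including Manufacturing & Service Operations Management and Operations Research.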
