Linh Le, Minh Hoang Nguyen, Duc Kieu, Hung Le, Hung The Tran, Sunil Gupta
European Conference on Artificial Intelligence (ECAI) 2025
We study cross-domain offline RL with limited target data, where neural domain-gap estimators often overfit and only part of the source dataset overlaps with the target domain. We propose DmC, combining a k-NN proximity estimator with a nearest-neighbor–guided diffusion model to generate target-aligned source samples, and show strong gains over prior methods on MuJoCo benchmarks.
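The k-NN proximity idea can be illustrated with a short sketch: score each source transition by its mean distance to its k nearest target transitions, then keep only the source samples closest to the target domain. All shapes, names, and the filtering quantile below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

# Hedged sketch of a k-NN proximity estimator in the spirit of DmC.
rng = np.random.default_rng(0)
target_transitions = rng.normal(size=(200, 6))    # limited target data, (s, a, s') vectors
source_transitions = rng.normal(size=(1000, 6))   # abundant source data

k = 5
# Pairwise Euclidean distances from each source sample to every target sample.
diffs = source_transitions[:, None, :] - target_transitions[None, :, :]
dists = np.linalg.norm(diffs, axis=-1)            # shape (1000, 200)
knn_dists = np.sort(dists, axis=1)[:, :k]         # distances to k nearest target neighbors
proximity = -knn_dists.mean(axis=1)               # higher = closer to the target domain

# Keep only source transitions in the top quartile of proximity.
keep = proximity >= np.quantile(proximity, 0.75)
filtered_source = source_transitions[keep]
```

Unlike a learned neural domain-gap estimator, this non-parametric score has no weights to overfit on the small target dataset, which is the motivation the abstract points to.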
Linh Le, Minh Hoang Nguyen, Hung Le, Hung The Tran, Sunil Gupta
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML-PKDD) 2025
Offline robust RL is appealing when only fixed data is available, but it typically needs large datasets, while simulator data is cheaper yet suffers from dynamics mismatch. We propose HYDRO, a hybrid cross-domain robust RL framework that uses an online simulator alongside limited offline data, selecting reliable simulator samples via performance-gap–based uncertainty filtering and prioritized sampling, and show it outperforms prior methods across diverse tasks.
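The filter-then-prioritize step can be sketched generically: discard simulator samples whose uncertainty exceeds a threshold, then draw minibatches with probability decreasing in uncertainty. The uncertainty scores below are a random stand-in for HYDRO's performance-gap–based estimate, and all names and thresholds are illustrative assumptions.

```python
import numpy as np

# Generic sketch of uncertainty filtering + prioritized sampling of simulator data.
rng = np.random.default_rng(1)
n = 500
sim_batch = rng.normal(size=(n, 4))       # simulator transitions
uncertainty = rng.uniform(size=n)         # stand-in scores; HYDRO derives these from a performance gap

# 1) Filter: drop simulator samples that are too unreliable.
mask = uncertainty < 0.8
kept = sim_batch[mask]
kept_unc = uncertainty[mask]

# 2) Prioritize: sample surviving transitions with weight inversely related to uncertainty.
priority = 1.0 / (kept_unc + 1e-6)
probs = priority / priority.sum()
idx = rng.choice(len(kept), size=64, replace=True, p=probs)
minibatch = kept[idx]                     # reliable simulator samples, for training alongside offline data
```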
Minh Hoang Nguyen, Linh Le, Thommen George Karimpanal, Sunil Gupta, Hung Le
International Joint Conference on Artificial Intelligence (IJCAI) 2025
Decision Transformers often struggle in real-world offline RL because datasets are limited and dominated by suboptimal behavior. We propose CRDT, which injects counterfactual experiences to enable out-of-distribution reasoning and trajectory stitching without architectural changes, and show consistent improvements over standard DT on Atari and D4RL, including limited-data and dynamics-shifted settings.
Linh Le, Hung The Tran, Sunil Gupta
International Conference on Autonomous Agents and Multiagent Systems (AAMAS) 2024
We tackle cross-domain policy transfer under large dynamics mismatch, where the common "full support" assumption (the simulator covers all target transitions) is unrealistic. We propose a simple method that skews and extends source support toward target support to reduce support deficiencies, and show consistent gains over prior approaches across diverse benchmarks.