
SharpeRatio@k: novel metric for evaluation of risk-return tradeoff in off-policy evaluation - EurekAlert

SharpeRatio@k, a novel evaluation metric for off-policy evaluation (OPE) estimators, effectively measures the risk-return tradeoff in evaluating policies used in reinforcement learning and contextual bandits, a tradeoff that conventional metrics typically ignore, scientists at Tokyo Tech show. Inspired by risk assessment in financial portfolio management, the metric provides a more insightful evaluation of OPE, paving the way for improved policy selection.
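The summary above does not give the formula, so the following is only a minimal sketch. It assumes the metric mirrors the financial Sharpe ratio: the return is how much the best of the top-k policies chosen by an estimator improves on a baseline (for example, the behavior policy), and the risk is the spread of true values among those top-k policies. The function name, argument names, and the use of a population standard deviation are illustrative assumptions, not taken from the source.

```python
import math
import statistics


def sharpe_ratio_at_k(estimated_values, true_values, baseline_value, k):
    """Sketch of a Sharpe-ratio-style score for an OPE estimator.

    Assumed (hypothetical) definition:
        (best true value among top-k - baseline value) / std of top-k true values
    where the top-k policies are those ranked highest by the estimator.
    """
    if len(estimated_values) != len(true_values):
        raise ValueError("estimated_values and true_values must align")
    if not 2 <= k <= len(estimated_values):
        raise ValueError("k must be between 2 and the number of policies")

    # Rank candidate policies by the estimator's predicted value, best first.
    ranked = sorted(
        range(len(estimated_values)),
        key=lambda i: estimated_values[i],
        reverse=True,
    )
    top_k_true = [true_values[i] for i in ranked[:k]]

    # Return: improvement of the best selected policy over the baseline.
    excess_return = max(top_k_true) - baseline_value
    # Risk: spread of true values in the selected portfolio (assumed population std).
    risk = statistics.pstdev(top_k_true)

    if risk == 0:
        return math.nan  # ratio is undefined when all selected policies tie
    return excess_return / risk
```

Under these assumptions, an estimator that ranks a few poor policies near the top is penalized through the larger spread, even if its best pick is good, which is the risk that ranking-only metrics would miss.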

