Tag: Evaluation Metrics

Evaluation metrics determine how we judge whether an agent actually performs well. This tag covers episodic return, training stability, variance across runs, success rates, multi-seed averaging, and how to measure real progress in reinforcement learning (RL) experiments.
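
As a rough illustration of how several of these quantities fit together, here is a minimal Python sketch that aggregates episodic returns collected under different random seeds. The function name `summarize_seeds`, the input layout (a dict mapping each seed to its list of episodic returns), and the rule that an episode counts as a success when its return meets a threshold are all assumptions made for this example, not a standard API or a fixed definition of success.

```python
import statistics


def summarize_seeds(returns_by_seed, success_threshold):
    """Summarize evaluation results across seeds (hypothetical helper).

    returns_by_seed: dict mapping seed -> list of episodic returns.
    success_threshold: assumed cutoff; an episode with return >= this
        value is counted as a success.
    """
    # Average within each seed first, so every seed contributes equally.
    per_seed_means = [statistics.mean(r) for r in returns_by_seed.values()]

    # Flatten all episodes to compute an overall success rate.
    all_returns = [x for r in returns_by_seed.values() for x in r]
    successes = sum(1 for x in all_returns if x >= success_threshold)

    # Sample standard deviation needs at least two seeds.
    if len(per_seed_means) > 1:
        std_across_seeds = statistics.stdev(per_seed_means)
    else:
        std_across_seeds = 0.0

    return {
        "mean_return": statistics.mean(per_seed_means),
        "std_across_seeds": std_across_seeds,
        "success_rate": successes / len(all_returns),
    }


if __name__ == "__main__":
    # Made-up returns for three seeds, two evaluation episodes each.
    example = {0: [10.0, 20.0], 1: [30.0, 40.0], 2: [20.0, 30.0]}
    print(summarize_seeds(example, success_threshold=20.0))
```

Reporting the spread across seeds alongside the mean makes it easier to tell a real improvement from seed-to-seed noise; a single-seed result gives no such signal.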