Select examples that would cause the largest change in model parameters:
\[x^* = \arg\max_x \mathbb{E}_{y \sim p(y|x)} \left[\|\theta_{t \mid {y, x}} - \theta_t\|\right],\]
where \(\theta_{t \mid {y, x}}\) are the model parameters after also training on \(x\) labeled with \(y\).
Choose examples that minimize expected future error:
\[x^* = \arg\min_x \mathbb{E}_{y \sim p(y|x)} [\sum_{x' \in \mathcal{X}_\text{pool}} \mathbb{E}_{y' \sim p(y'|x')} \mathcal{L}(\theta_{t \mid y, x}; y', x') ].\]
Why does this not contain the current loss?