Robustness


Lecture 25

November 15, 2023

Review and Questions

Misspecification and Deep Uncertainty

  • Both model misspecification and deep uncertainty complicate decision-making (and hence prescriptive analysis)
  • We may end up with decisions that satisfy one set of preferences or beliefs but not another, or that are based on incorrect parameter values

Robustness

What is Robustness?

Robustness: How well a decision performs under plausible alternative specifications.

Note: We are using “robustness” in a slightly different sense than some other related uses.

If you see “robust optimization,” that is a different thing: an approach to mathematical programming under uncertainty.

The Basic Idea of Robustness

Robustness: summarizing how well a decision works across a range of different plausible futures.

Robustness can be measured using a variety of metrics (more on this later).

The Lake Problem and Robustness

Suppose we estimate \(q=2.5\), \(b=0.4\), and runoffs \(y_t \sim \text{LogNormal}(0.03, 0.25)\).

After testing against 1000 runoff samples, we decide to try a release of \(a_t = 0.04\) units each year.
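
To make this concrete, here is a minimal Julia sketch of this screening step, using the standard shallow-lake dynamics \(X_{t+1} = X_t + a_t + y_t + \frac{X_t^q}{1+X_t^q} - b X_t\); the time horizon and the critical phosphorus threshold (Xcrit) below are hypothetical placeholders, not the exact course settings.

  using Distributions, Random, Statistics

  # Shallow-lake dynamics: one trajectory of lake phosphorus X under a
  # constant release a, recycling exponent q, outflow rate b, and runoffs y.
  function lake_trajectory(a, q, b, y)
      X = zeros(length(y) + 1)
      for t in eachindex(y)
          X[t + 1] = X[t] + a + y[t] + X[t]^q / (1 + X[t]^q) - b * X[t]
      end
      return X
  end

  Random.seed!(1)
  q, b = 2.5, 0.4                 # assumed parameter estimates
  a = 0.04                        # candidate annual release
  T, N = 100, 1000                # horizon and number of runoff samples
  Xcrit = 1.5                     # hypothetical eutrophication threshold

  runoffs = rand(LogNormal(0.03, 0.25), T, N)
  # fraction of runoff samples for which the lake stays below the threshold
  reliability = mean(maximum(lake_trajectory(a, q, b, runoffs[:, n])) < Xcrit for n in 1:N)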

But What If We’re Wrong?

Now suppose \(q=2.4\) instead…

But What If We’re Wrong?

Or \(b=0.39\)…

But What If We’re Wrong?

Or \(b=0.41\)…

Constraint Performance Varying \(q\) and \(b\)

Robustness Metrics

Robustness Metrics

Given an assessment of performance over a variety of specifications (or states of the world), there are a number of metrics that can be used to capture robustness, and the choice can matter quite a bit.

Two common families are satisficing and regret.

Satisficing

Satisficing metrics express the degree to which performance criteria are satisfied across the considered states of the world.

Satisficing

A simple satisficing metric: what is the fraction of states of the world (SOWs) in which the criterion is met, or

\[S=\frac{1}{N}\sum_{n=1}^N I_n,\]

where \(I_n\) is an indicator equal to 1 if the performance criterion is met in SOW \(n\) and 0 otherwise.
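
For instance, a minimal Julia sketch, assuming we have already recorded whether the criterion is met in each SOW (the indicator values below are placeholders):

  using Statistics

  # Whether the performance criterion is met in each SOW; placeholder
  # values, which in practice come from simulating the decision in each SOW.
  criterion_met = [true, false, true, true, false, true, true, true]

  S = mean(criterion_met)   # fraction of SOWs in which the criterion is met (0.75)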

Satisficing

Be Careful With Ranges

Over these ranges, we could calculate a satisficing score of 41%.

If we shrink the range of \(q\) to be \((2, 3)\), that goes up to 58%.

Which is “right”?

Alternative Satisficing Metrics

Other satisficing metrics measure the “distance” the system can move from the baseline case before it fails.

Regret

Regret metrics capture how much we “regret” worse performance across SOWs, measured as the loss in objective value.

Regret Metrics

A simple regret metric: what is the average worsening of performance across SOWs?

\[R = \frac{1}{N} \sum_{n=1}^N \frac{\min(P_n - P_\text{base}, 0)}{P_\text{base}},\]

where \(P_\text{base}\) is the performance in the design SOW and \(P_n\) is the performance in SOW \(n\). Since the numerator is non-positive, \(R \leq 0\): values closer to zero indicate less regret.
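
A minimal Julia sketch with placeholder performance values (higher assumed better):

  using Statistics

  P_base = 0.90                          # performance in the design SOW
  P = [0.95, 0.80, 1.02, 0.70, 0.88]     # performance in each SOW (placeholders)

  # average relative worsening across SOWs; non-positive by construction
  R = mean(min.(P .- P_base, 0) ./ P_base)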

Regret Metrics

Alternative Regret Metrics

\[R_2 = \frac{1}{N} \sum_{n=1}^N \frac{P^\text{base}_n - P^\text{opt}_n}{P^\text{opt}_n}\]

where \(P^\text{opt}_n\) is the performance of the optimal decision in SOW \(n\) and \(P^\text{base}_n\) is the performance of the baseline decision in that SOW.
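
The computation is analogous; a sketch with placeholder values for the baseline decision's performance and the SOW-optimal performance:

  using Statistics

  P_base_n = [0.85, 0.78, 0.92, 0.70]   # baseline decision's performance in each SOW
  P_opt_n  = [0.90, 0.85, 0.95, 0.88]   # performance of each SOW's optimal decision

  R2 = mean((P_base_n .- P_opt_n) ./ P_opt_n)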

Alternative Regret Metrics

Other options (see the sketch after this list):

  • Maximin (goal: maximize worst-case outcome)
  • Maximax (goal: maximize best-case outcome)
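
A minimal Julia sketch comparing two hypothetical decisions across three SOWs (placeholder values; higher is better):

  # performance of each candidate decision across the SOW ensemble
  perf = Dict("a1" => [0.90, 0.40, 0.80],
              "a2" => [0.70, 0.60, 0.65])

  best_worst_case = argmax(d -> minimum(perf[d]), keys(perf))   # maximin picks "a2"
  best_best_case  = argmax(d -> maximum(perf[d]), keys(perf))   # maximax picks "a1"

Even on this toy example the two criteria select different decisions.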

Optimizing for Robustness

Robustness is often used to stress-test an identified decision alternative.

We could also use these metrics during our optimization procedure by:

  • minimizing regret or maximizing satisficing as an objective;
  • using either as a constraint.

Optimizing for Robustness

Just be careful:

These different metrics measure very different things and can rank decisions differently.

Other Considerations

  • How are different SOWs generated?
  • Are there multiple objectives/thresholds of interest?

Key Takeaways

Key Takeaways

  • Robustness is an approach to measuring how well a decision performs under misspecification.
  • Common approaches: satisficing and regret metrics.
  • There are many different robustness metrics; choose one appropriate for the context.
  • Metrics are sensitive to the alternative SOWs considered; select ranges/distributions carefully.

Upcoming Schedule

Upcoming Schedule

Friday: Sensitivity Analysis

After Thanksgiving: No class; we will schedule 10-minute meetings with teams during class time the week after Thanksgiving to check on project progress. Attendance is required during your meeting.

Assessments

Lab 4: Due Friday

Project: Make sure you check what’s due when!