16:332:541 Stochastic Signals and Systems Assignment 1

Due: October 8, 2021

Suppose you have $n$ suitcases and suitcase $i$ holds $X_i$ dollars, where $X_1, X_2, \ldots, X_n$ are iid continuous uniform $(0, m)$ random variables. (Think of a number like one million for the symbol $m$.) Unfortunately, you don’t know $X_i$ until you open suitcase $i$.

Suppose you can open the suitcases one by one, starting with suitcase $n$ and going down to suitcase 1. After opening suitcase $i$, you can either accept or reject the $X_i$ dollars. If you accept suitcase $i$, the game ends. If you reject, then you get to choose only from the still unopened suitcases.

What should you do? Perhaps it is not so obvious? In fact, you can decide before the game on a policy, a set of rules to follow about whether to accept the $X_i$ dollars in suitcase $i$. We will specify a policy by a vector $(\tau_1, \ldots, \tau_n)$ of threshold parameters such that

• After opening suitcase $i$, you accept the amount $X_i$ if $X_i \ge \tau_i$; otherwise, you reject suitcase $i$ and open suitcase $i-1$.

You would like to choose a policy $(\tau_1^*, \tau_2^*, \ldots, \tau_n^*)$ that maximizes your expected winnings.
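The rules above can be sketched as a small Monte Carlo simulation, which may help in checking an analytical answer numerically. This is only an illustrative sketch: the function names `play_once` and `estimate_expected_winnings` are hypothetical, and the code assumes that rejecting every suitcase (including suitcase 1) leaves you with winnings of 0.

```python
import random


def play_once(thresholds, m, rng):
    """Play one game; thresholds[i - 1] holds tau_i for suitcase i."""
    n = len(thresholds)
    # Open suitcases in the order n, n-1, ..., 1.
    for i in range(n, 0, -1):
        x = rng.uniform(0.0, m)  # X_i ~ uniform(0, m)
        if x >= thresholds[i - 1]:
            return x  # accept suitcase i; the game ends
    # Assumption: rejecting every suitcase yields no winnings.
    return 0.0


def estimate_expected_winnings(thresholds, m=1.0, trials=100_000, seed=0):
    """Estimate expected winnings under a threshold policy by averaging."""
    rng = random.Random(seed)
    total = sum(play_once(thresholds, m, rng) for _ in range(trials))
    return total / trials
```

For example, `estimate_expected_winnings([0.0, 0.3], m=1.0)` estimates the expected winnings with two suitcases when $\tau_1 = 0$ and $\tau_2 = 0.3$; the value 0.3 is an arbitrary illustration, not a claimed optimum.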

(a) Find the conditional expected value $E[X_i \mid X_i \ge \tau_i]$.

(b) Let $W_1(\tau_1)$ denote your winnings given there is just 1 unopened suitcase and your threshold is $\tau_1$. What is $E[W_1(\tau_1)]$ as a function of $\tau_1$? We can write the value of $\tau_1$ that maximizes $E[W_1(\tau_1)]$ as

$$\tau_1^* = \arg\max_{\tau_1} E[W_1(\tau_1)].$$

Show that $\tau_1^* = 0$. That is, you should never reject what you get in the last suitcase.

(c) Let $W_k(\tau_1, \ldots, \tau_k)$ denote your reward given that there are $k$ unopened suitcases remaining and the thresholds are $\tau_1, \ldots, \tau_k$. As a function of $\tau_k$, find a recursive relationship for $E[W_k(\tau_1, \ldots, \tau_k)]$ in terms of $\tau_k$ and $E[W_{k-1}(\tau_1, \ldots, \tau_{k-1})]$.

(d) When there are 2 suitcases left, you choose the threshold

$$\tau_2^* = \arg\max_{\tau_2} E[W_2(\tau_1^*, \tau_2)].$$

What is $\tau_2^*$? (Keep in mind that $\tau_1^* = 0$.)

(e) Suppose you have already found the optimized thresholds $\tau_1^*, \ldots, \tau_{k-1}^*$. When there are $k$ suitcases left, you choose the optimized threshold

$$\tau_k^* = \arg\max_{\tau_k} E\left[W_k(\tau_1^*, \ldots, \tau_{k-1}^*, \tau_k)\right].$$

Find the optimized policy $(\tau_1^*, \ldots, \tau_4^*)$.

(f) Define $W_k^* = W_k(\tau_1^*, \ldots, \tau_k^*)$ and $\alpha_k = E[W_k^*]/m$. How are $\alpha_1, \alpha_2, \ldots, \alpha_k$ related?

(g) What is $\lim_{k \to \infty} \alpha_k$?
