Why we expect that we can easily pick m sufficiently large, Programming Languages

Why we expect that we can easily pick m sufficiently large

Problem: Adaboost

I. Consider running the Perceptron algorithm on a training set S arranged in a certain order. Now suppose we run it with the same initial weights and on the same training set but in a different order, S 0 . Does Perceptron make the same number of mistakes? Does it end up with the same final weights? If so, prove it. If not, give a counterexample, i.e. an S and S 0 where order matters.

II. We have mainly focused on squared loss, but there are other interesting losses in machine learning. Consider the following loss function which we denote by φ(z) = max(0, -z). Let S be a training set (x 1 , y1 ), . . . ,(x m, ym) where each x i ∈ R n and y i ∈ {-1, 1}. Consider running stochastic gradient descent (SGD) to find a weight vector w that minimizes 1 m Pm i=1 φ(y i • w T x i ). Explain the explicit relationship between this algorithm and the Perceptron algorithm. Recall that for SGD, the update rule when the i th example is picked at random is wnew = wold - η∇φ y iw T x i .

III. Here we will give an illustrative example of a weak learner for a simple concept class. Let the domain be the real line, R, and let C refer to the concept class of "3-piece classifiers", which are functions of the following form: for θ1 < θ2 and b ∈ {-1, 1}, hθ1,θ2,b(x) is b if x ∈ [θ1, θ2] and -b otherwise. In other words, they take a certain Boolean value inside a certain interval and the opposite value everywhere else. For example, h10,20,1(x) would be +1 on [10, 20], and -1 everywhere else. Let H refer to the simpler class of "decision stumps", i.e. functions hθ,b such that h(x) is b for all x ≤ θ and -b otherwise.

i. Show formally that for any distribution on R (assume finite support, for simplicity; i.e., assume the distribution is bounded within [-B, B] for some large B) and any unknown labeling function c ∈ C that is a 3-piece classifier, there exists a decision stump h ∈ H that has error at most 1/3, i.e. P[h(x) 6= c(x)] ≤ 1/3. (b) Describe a simple, efficient procedure for finding a decision stump that minimizes error with respect to a finite training set of size m. Such a procedure is called an empirical risk minimizer (ERM).

ii. Give a short intuitive explanation for why we should expect that we can easily pick m sufficiently large that the training error is a good approximation of the true error, i.e. why we can ensure generalization. (Your answer should relate to what we have gained in going from requiring a learner for C to requiring a learner for H.) This lets us conclude that we can weakly learn C using H.

The response should include a reference list. Using double-space, Times New Roman 12 pnt font, one-inch margins, and APA style of writing and citations.

View Complete Question

Request for Solution File

Ask an Expert for Answer!!

Programming Languages: Why we expect that we can easily pick m sufficiently large

Reference No:- TGS03137843

Expected delivery within 24 Hours

Have a Question? (oR Write a Review)

Write atleast 100 words!!

Request for Solution File

Ask an Expert for Answer!!

Programming Languages: Why we expect that we can easily pick m sufficiently large

Reference No:- TGS03137843

Have a Question? (oR Write a Review)

Recent Questions Asked Programming Languages

Q : What do you believe is most effective way to avoid usability

Q : How suitable are the choices and deployment of the features

Q : How does a support specialist access the tool

Q : Discuss why selected day and time is ideal for maintenance

Q : Why we expect that we can easily pick m sufficiently large

Q : Discuss companys trademark that has been awarded protection

Q : Discuss the importance of information protection

Q : Analyze a few situations that can occur if critical sections

Q : Describe data methodology and results in a technical report

Discuss signs and symptoms of hpv related cancer

Describe structured multimodal pain management program

Discuss client with severe atherosclerotic disease

Reflect on the definition and goal of ebp

Examine the process of putting a new policy into place

Essential information for early childhood professionals

Discuss about the value of examining your personal biases

Request for Solution File

Ask an Expert for Answer!!

Programming Languages: Why we expect that we can easily pick m sufficiently large

Reference No:- TGS03137843

Recent Questions Asked Programming Languages

Q : What do you believe is most effective way to avoid usability

Q : How suitable are the choices and deployment of the features

Q : How does a support specialist access the tool

Q : Discuss why selected day and time is ideal for maintenance

Q : Why we expect that we can easily pick m sufficiently large

Q : Discuss companys trademark that has been awarded protection

Q : Discuss the importance of information protection

Q : Analyze a few situations that can occur if critical sections

Q : Describe data methodology and results in a technical report

Asked Questions