Create pivot tables for the mean of the binary outcome, Database Management System

Assignment: Data Analytics

I. Financial Condition of Banks. The file Banks.csv includes data on a sample of 20 banks. The "Financial Condition" column records an expert's judgment of the financial condition of each bank. This outcome variable takes one of two possible values, weak or strong, according to the financial condition of the bank. The predictors are two ratios used in the financial analysis of banks: TotLns&Lses/Assets is the ratio of total loans and leases to total assets, and TotExp/Assets is the ratio of total expenses to total assets. The goal is to use the two ratios to classify the financial condition of a new bank. Run a logistic regression model (on the entire dataset) that models the status of a bank as a function of the two financial measures provided. Specify the success class as weak (this is similar to creating a dummy that is 1 for financially weak banks and 0 otherwise), and use the default cutoff value of 0.5.

i. Consider a new bank whose total loans and leases/assets ratio = 0.6 and total expenses/assets ratio = 0.11. From your logistic regression model, estimate the following four quantities for this bank (use R to do all the intermediate calculations; show your final answers to four decimal places): the logit, the odds, the probability of being financially weak, and the classification of the bank (use cutoff = 0.5).
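The four quantities in part i are successive transformations of the fitted linear predictor. The assignment asks for R; what follows is only a minimal Python sketch of the same arithmetic. The coefficient values are hypothetical placeholders, not estimates from Banks.csv, and the function name `score_bank` is an assumption made for illustration.

```python
import math

def score_bank(intercept, b_loans, b_exp, loans_ratio, exp_ratio, cutoff=0.5):
    """Return (logit, odds, probability of weak, label) for one bank."""
    # Linear predictor of the logistic model (success class = weak)
    logit = intercept + b_loans * loans_ratio + b_exp * exp_ratio
    odds = math.exp(logit)            # odds = e^logit
    prob = odds / (1.0 + odds)        # probability = odds / (1 + odds)
    label = "weak" if prob > cutoff else "strong"
    return logit, odds, prob, label

# HYPOTHETICAL coefficients, chosen only to exercise the formulas;
# replace them with the estimates from your own fitted model.
logit, odds, prob, label = score_bank(-1.0, 2.0, 5.0, loans_ratio=0.6, exp_ratio=0.11)
print(f"logit={logit:.4f} odds={odds:.4f} prob={prob:.4f} class={label}")
```

With real coefficients substituted, the same three steps give the values the question asks for, reported to four decimal places.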

ii. The cutoff value of 0.5 is used in conjunction with the probability of being financially weak. Compute the threshold that should be used if we want to make a classification based on the odds of being financially weak, and the threshold for the corresponding logit.
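Because odds and logit are monotone transformations of the probability, any probability cutoff maps to an equivalent odds threshold and logit threshold. A small Python sketch of that mapping (the helper name `cutoff_to_thresholds` is an assumption for illustration):

```python
import math

def cutoff_to_thresholds(p_cutoff):
    """Map a probability cutoff to the equivalent odds and logit thresholds."""
    odds_threshold = p_cutoff / (1.0 - p_cutoff)   # odds = p / (1 - p)
    logit_threshold = math.log(odds_threshold)     # logit = ln(odds)
    return odds_threshold, logit_threshold

odds_t, logit_t = cutoff_to_thresholds(0.5)
print(odds_t, logit_t)  # 1.0 0.0
```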

iii. When a bank that is in poor financial condition is misclassified as financially strong, the misclassification cost is much higher than when a financially strong bank is misclassified as weak. To minimize the expected cost of misclassification, should the cutoff value for classification (which is currently at 0.5) be increased or decreased?
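One way to reason about part iii is the standard expected-cost argument: classify a bank as weak whenever the expected cost of calling it strong exceeds the expected cost of calling it weak. The Python sketch below assumes well-calibrated probabilities and uses made-up cost values; the function name `cost_based_cutoff` is hypothetical.

```python
def cost_based_cutoff(cost_weak_called_strong, cost_strong_called_weak):
    """Probability cutoff that minimizes expected misclassification cost.

    Classify as weak when p * cost_weak_called_strong exceeds
    (1 - p) * cost_strong_called_weak; solving for p gives the cutoff below.
    """
    total = cost_weak_called_strong + cost_strong_called_weak
    return cost_strong_called_weak / total

# Equal costs reproduce the default cutoff.
print(cost_based_cutoff(1.0, 1.0))   # 0.5

# HYPOTHETICAL costs: missing a weak bank is taken to be 5 times as costly.
cutoff = cost_based_cutoff(5.0, 1.0)
print(round(cutoff, 4))
```

Under these assumptions, the more expensive it is to miss a weak bank, the further the cutoff moves away from 0.5 in the direction that flags more banks as weak.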

II. Competitive Auctions on eBay. The file eBayAuctions.csv contains information on 1972 auctions transacted on eBay.com during May-June 2004. The goal is to use these data to build a model that will distinguish competitive auctions from noncompetitive ones. A competitive auction is defined as an auction with at least two bids placed on the item being auctioned. The data include variables that describe the item (auction category), the seller (his or her eBay rating), and the auction terms that the seller selected (auction duration, opening price, currency, day of week of auction close). In addition, we have the price at which the auction closed. The goal is to predict whether or not an auction of interest will be competitive.

Data preprocessing. Create dummy variables for the categorical predictors. These include Category (18 categories), Currency (USD, GBP, Euro), EndDay (Monday-Sunday), and Duration (1, 3, 5, 7, or 10 days).

i. Create pivot tables for the mean of the binary outcome (Competitive?) as a function of the various categorical variables (use the original variables, not the dummies). Use the information in the tables to reduce the number of dummies that will be used in the model. For example, categories that appear most similar with respect to the distribution of competitive auctions could be combined.
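A pivot table of the mean of a 0/1 outcome is simply the proportion of competitive auctions within each level of a categorical variable. The Python sketch below shows that computation with the standard library only. The rows are made-up toy data, not taken from eBayAuctions.csv, and the key names `currency` and `competitive` are assumptions about how the columns might be labeled.

```python
from collections import defaultdict

def mean_by_category(rows, category_key, outcome_key):
    """Mean of a binary outcome for each level of a categorical variable."""
    totals = defaultdict(lambda: [0, 0])   # level -> [sum of outcome, count]
    for row in rows:
        acc = totals[row[category_key]]
        acc[0] += row[outcome_key]
        acc[1] += 1
    return {level: s / n for level, (s, n) in totals.items()}

# TOY rows for illustration only (not real auction data).
rows = [
    {"currency": "USD", "competitive": 1},
    {"currency": "USD", "competitive": 0},
    {"currency": "USD", "competitive": 1},
    {"currency": "GBP", "competitive": 1},
    {"currency": "GBP", "competitive": 1},
    {"currency": "EUR", "competitive": 0},
]

table = mean_by_category(rows, "currency", "competitive")
for level, mean in sorted(table.items()):
    print(f"{level}: {mean:.3f}")
```

Levels whose means are close to one another are candidates for merging into a single dummy, which is the reduction the question describes.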

ii. Split the data into training (60%) and validation (40%) datasets. Run a logistic regression model with all predictors, using a cutoff of 0.5. If we want to predict at the start of an auction whether it will be competitive, we cannot use the information on the closing price. Run a logistic regression model with all predictors as above, excluding price. How does this model compare to the full model with respect to predictive accuracy?
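The partitioning and the accuracy comparison can be sketched as follows. This is an illustrative Python outline, not the R workflow the assignment asks for. The seed, the helper names `split_train_valid` and `accuracy`, and the example probabilities are all assumptions.

```python
import random

def split_train_valid(rows, train_frac=0.6, seed=1):
    """Randomly partition rows into training and validation sets."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_train = round(train_frac * len(shuffled))
    return shuffled[:n_train], shuffled[n_train:]

def accuracy(probs, actual, cutoff=0.5):
    """Share of records whose predicted class (prob > cutoff) matches the actual class."""
    predicted = [1 if p > cutoff else 0 for p in probs]
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

# Row indices stand in for the 1972 auctions.
train, valid = split_train_valid(range(1972))
print(len(train), len(valid))  # 1183 789

# MADE-UP predicted probabilities and outcomes, only to show the call.
print(accuracy([0.9, 0.2, 0.6, 0.4], [1, 0, 0, 0]))  # 0.75
```

Computing `accuracy` on the validation set for the model with price and for the model without it gives the comparison the question asks about.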

iii. Interpret the meaning of the coefficient for closing price. Does closing price have a practical significance? Is it statistically significant for predicting competitiveness of auctions? (Use a 10% significance level.)
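A logistic regression coefficient is usually interpreted through its exponential, the multiplicative change in the odds per one-unit increase in the predictor, and its statistical significance is usually judged from the Wald z statistic. The Python sketch below illustrates both steps. The coefficient and z value are hypothetical, not output from a fitted model, and the helper names are assumptions.

```python
import math

def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase in the predictor."""
    return math.exp(coefficient)

def two_sided_p_value(z):
    """Two-sided p-value for a Wald z statistic under the standard normal."""
    return math.erfc(abs(z) / math.sqrt(2.0))

# HYPOTHETICAL values for illustration only.
beta_price = 0.1
z_price = 1.96

print(round(odds_ratio(beta_price), 4))          # 1.1052
p = two_sided_p_value(z_price)
print(round(p, 3), "significant at 10%:", p < 0.10)
```

Practical significance is a separate judgment: it depends on whether the implied change in the odds per unit of price is large enough to matter for the decision at hand.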

iv. Use stepwise selection (use function step() in the stats package or function stepAIC() in the MASS package) and an exhaustive search (use function glmulti() in package glmulti) to find the model with the best fit to the training data. Which predictors are used?

v. Use stepwise selection and an exhaustive search to find the model with the lowest predictive error rate (use the validation data). Which predictors are used?
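An exhaustive search evaluates every subset of predictors and keeps the one with the lowest error, whereas stepwise selection examines only a path through those subsets. The Python sketch below shows the exhaustive idea in general form. It is not the `glmulti()` procedure named above. The predictor names are assumed, and `toy_error` is a stand-in with made-up numbers where a real analysis would fit a model and return its validation error rate.

```python
from itertools import combinations

def exhaustive_search(predictors, error_fn):
    """Evaluate every subset of predictors and return the one with the lowest error."""
    best_subset, best_error = (), error_fn(())
    for k in range(1, len(predictors) + 1):
        for subset in combinations(predictors, k):
            err = error_fn(subset)
            if err < best_error:          # strict: ties keep the smaller, earlier subset
                best_subset, best_error = subset, err
    return best_subset, best_error

def toy_error(subset):
    """STAND-IN error function with made-up numbers (not a fitted model)."""
    useful = {"OpenPrice": 0.10, "sellerRating": 0.05}   # hypothetical gains
    error = 0.50 - sum(useful.get(name, 0.0) for name in subset)
    return error + 0.01 * len(subset)                    # small penalty per predictor

best_subset, best_error = exhaustive_search(
    ["OpenPrice", "sellerRating", "Duration"], toy_error
)
print(best_subset, round(best_error, 2))
```

The number of subsets doubles with each added predictor, which is why reducing the number of dummies first makes an exhaustive search more practical.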

Format your assignment according to the following formatting requirements:

i) The answer should be typed, using Times New Roman font (size 12), double spaced, with one-inch margins on all sides.

ii) Include a cover page containing the title of the assignment, the student's name, the course title, and the date. The cover page is not included in the required page length.

iii) Also include a reference page. Citations and references must follow APA format. The reference page is not included in the required page length.

Attachment: banks.rar

Reference No:- TGS03093718
