Let’s say blood pressure (BP) measurements are more likely to be missing among young people, who generally have lower blood pressure. You use mean imputation to train your model. Which option correctly characterizes the mean BP (after imputation) in your training dataset?

Question

topGrad Admin · Accepted Answer

The BP measurements which will be present in your dataset will be higher than average, since many of the lower BP values from the younger people in the dataset will be missing.   Therefore the mean which you impute will be higher than average, so the overall mean will be higher. This is an example of bias introduced in your dataset through imputation.

Q:

Data Analysis

Machine Learning

Similar Questions

True or False: A tree of depth 1 is more expressive than a classical linear model.

Using regression imputation, and the decision tree shown here, what is your prediction for this person’s risk of heart attack?

Assume you have missing data on one of your features, and are considering two options: True or False: “Both options have the same performance”.

One way to aggregate predictions from multiple trees is by a majority vote. Using this aggregation rule, select the prediction of the following trees on the data point (x=4, y=7, z=2):

You’ve fit a random forest of 10 trees with max depth 20. Your training ROC is 0.99 and test ROC is 0.54. Which of the following is NOT a reasonable thing to try?

True or False: When your data is missing at random, then whether or not you are missing a covariate is completely independent of your outcome.

Let’s say blood pressure (BP) measurements are more likely to be missing among young people, who generally have lower blood pressure. You use mean imputation to train your model. Which option correctly characterizes the mean BP (after imputation) in your training dataset?

Which decision boundary corresponds to the following decision tree? In the options, red indicates high risk, blue indicates low risk.

When is complete case analysis least likely to bias your model?

You have created a model using mean imputation. At test time, you should fill in missing values with:

Practice More Questions

Data Analysis

Machine Learning

Q:

Created with Fabric.js 4.6.0 Similar Questions

Created with Fabric.js 4.6.0 Practice More Questions

Similar Questions

Practice More Questions