Undersampling the dataset using R

diabetes <- read.csv("diabetes-dataset.csv", sep = ",", header = TRUE)
Summary of dataset
diabetes$Outcome <- as.factor(diabetes$Outcome)
Summary of dataset after converting the Outcome variable to a factor
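To see how unevenly the two classes are represented before any resampling, the rows per level can be tabulated. This is a minimal sketch that assumes a small made-up data frame in place of diabetes-dataset.csv; the name toy and its values are hypothetical and not taken from the original data.

```r
# Toy stand-in for the diabetes data; the values are made up for illustration.
toy <- data.frame(
  Glucose = c(148, 85, 183, 89, 137, 116, 78, 115),
  Outcome = c(1, 0, 1, 0, 0, 0, 0, 0)
)
toy$Outcome <- as.factor(toy$Outcome)

# Absolute number of rows per class.
class_counts <- table(toy$Outcome)
# Share of each class in the whole data frame.
class_share <- prop.table(class_counts)

print(class_counts)
print(round(class_share, 2))
```

Running the same two calls on diabetes$Outcome should show whether one class dominates and by how much.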
#holds instances where outcome is 1
diabetes_true <- diabetes[(diabetes$Outcome == 1), ]
#holds instances where outcome is 0
diabetes_false <- diabetes[(diabetes$Outcome == 0), ]
#Undersampling the Outcome == 0 rows to reduce the class imbalance
diabetes_false <- diabetes_false[sample(nrow(diabetes_false), 1026, replace = FALSE), ]
diabetes_final <- rbind(diabetes_true, diabetes_false)
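The hard-coded 1026 fits only this particular file. A possible generalisation, assuming the intent is a 1:1 class ratio, takes the sample size from the smallest class and sets a seed so the draw is repeatable. The helper name undersample, the seed value, and the toy data are all hypothetical and serve only as an illustration.

```r
# Hypothetical helper: randomly undersample every class to the size of the smallest one.
undersample <- function(df, target) {
  # Split the rows by class, ignoring factor levels that have no rows.
  groups <- split(df, df[[target]], drop = TRUE)
  n_min <- min(sapply(groups, nrow))
  # Draw n_min rows without replacement from each class.
  sampled <- lapply(groups, function(g) g[sample(nrow(g), n_min), , drop = FALSE])
  do.call(rbind, sampled)
}

set.seed(42)  # arbitrary seed, only to make the example repeatable

# Toy stand-in for the diabetes data; the values are made up for illustration.
toy <- data.frame(
  Glucose = c(148, 85, 183, 89, 137, 116, 78, 115),
  Outcome = factor(c(1, 0, 1, 0, 0, 0, 0, 0))
)

balanced <- undersample(toy, "Outcome")
print(table(balanced$Outcome))
```

With the real data the call would be undersample(diabetes, "Outcome"). Note that undersampling discards rows from the larger class, so it trades data volume for balance.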
Before Sampling (Biased Data)
After Sampling


