Data Activity 5 / Unit 7

Task

Find the mean, median and mode of the variables sbp, dbp and income. Produce the five-figure summary of the income variable and present it using a boxplot. Run a suitable hypothesis test to see whether there is any association between systolic blood pressure and the presence or absence of peptic ulcer.

Process and Findings

library(haven)  # provides read_sav() for SPSS .sav files
health_data <- read_sav("path location")
# mean of each variable, one call per column
mean(health_data$sbp)
mean(health_data$dbp)
mean(health_data$income)

> mean(health_data$sbp)
[1] 127.7333
> mean(health_data$dbp)
[1] 82.76667
> mean(health_data$income)
[1] 85194.49

# try to compute all three means in a single command
colMeans(health_data[, c("sbp", "dbp", "income")], na.rm = TRUE)
# this worked but was not necessarily quicker!
# the comma after the first square bracket selects all rows (data[rows, columns])
# na.rm = TRUE removes missing values (NA) so they don't affect the result

> colMeans(health_data[, c("sbp", "dbp", "income")], na.rm = TRUE)
        sbp         dbp      income 
  127.73333    82.76667 85194.48571
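
The task also asks for the median and mode, which the commands above do not cover. The following is a minimal sketch of one way to get them, not the method used in this activity. It runs on a small made-up data frame (`toy`) rather than the health dataset, and `stat_mode` is a hypothetical helper written for illustration, because base R has no function for the statistical mode.

```r
# Toy data for illustration only; these are NOT values from the health dataset
toy <- data.frame(
  sbp    = c(120, 130, 120, 140, 125),
  dbp    = c(80, 85, 80, 90, 82),
  income = c(52000, 61000, 61000, 75000, 98000)
)

# Median of each column in one call
medians <- sapply(toy, median, na.rm = TRUE)

# Hypothetical helper: base R's mode() returns the storage type, not the
# most frequent value, so tabulate the values and keep the most frequent one(s)
stat_mode <- function(x) {
  x <- x[!is.na(x)]
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

# lapply() keeps a list because a column can have more than one mode
modes <- lapply(toy, stat_mode)

print(medians)
print(modes)
```

Replacing `toy` with `health_data[, c("sbp", "dbp", "income")]` should apply the same calls to the real variables. For continuous variables such as income, every value may be unique, in which case the helper returns all of them and the mode is not very informative.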

# five-figure summary of "income" (min, Q1, median, Q3, max); summary() also reports the mean
summary(health_data$income)

> summary(health_data$income)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  52933   68637   86561   85194   99696  117210
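
Because `summary()` reports six values, it can help to see the five figures on their own. The sketch below uses base R's `fivenum()` and `quantile()` on a made-up `income` vector (an assumption for illustration, not values from the health dataset). Note that `fivenum()` uses Tukey's hinges, which can differ slightly from the quartiles that `summary()` reports.

```r
# Toy income values for illustration only; not taken from the health dataset
income <- c(52000, 61000, 68000, 75000, 86000, 99000, 117000)

# fivenum() returns Tukey's five-number summary:
# minimum, lower hinge, median, upper hinge, maximum
five <- fivenum(income)
names(five) <- c("min", "lower_hinge", "median", "upper_hinge", "max")
print(five)

# quantile() gives the quartile-based version, i.e. the values
# summary() reports apart from the mean
quartiles <- quantile(income, probs = c(0, 0.25, 0.5, 0.75, 1))
print(quartiles)
```

Either result can be passed alongside a boxplot, since the box and whiskers are drawn from these same five figures when there are no outliers.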

boxplot(health_data$income, main="income", col = "blue") # Figure 1

# or using ggplot2 (outlier styling is not relevant to this example but is included for reference). Figure 2
library(ggplot2)
ggplot(data = health_data, aes(x = "", y = income)) +
  geom_boxplot(fill = "steelblue", outlier.colour = "red", outlier.shape = 16) +
  labs(title = "Income Distribution", x = NULL, y = "Income") +
  theme_minimal() +
  theme(
    plot.title = element_text(hjust = 0.5, size = 16, face = "bold", margin = margin(b = 15)), #center title
    axis.text.x = element_blank(),       # remove x-axis label text
    axis.ticks.x = element_blank())       # remove x-axis ticks


# Run a suitable hypothesis test to see whether there is any association between systolic blood pressure and the presence or absence of peptic ulcer.
# H0: there is no association between SBP and the presence of peptic ulcer
# H1: there is an association between SBP and the presence of peptic ulcer

# run a two-sample t-test (var.equal = TRUE assumes equal variances in the two groups)
t.test(sbp ~ pepticulcer, data = health_data, var.equal = TRUE)

	Two Sample t-test

data:  sbp by pepticulcer
t = 1.2772, df = 208, p-value = 0.203
alternative hypothesis: true difference in means between group Yes and group No is not equal to 0
95 percent confidence interval:
 -2.420494 11.326830
sample estimates:
mean in group Yes  mean in group No 
         131.3171          126.8639 
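
The test above sets `var.equal = TRUE`, which assumes the two groups have the same variance. As a hedged sketch of how that assumption could be checked, the code below runs an F test with `var.test()` and compares the pooled t-test with Welch's t-test (R's default). It uses simulated toy data, not the health dataset, so the numbers it prints say nothing about the real result.

```r
set.seed(1)  # reproducible toy data; NOT the health dataset

# Simulated SBP values for 20 "Yes" and 30 "No" cases (illustrative assumption)
toy <- data.frame(
  sbp = c(rnorm(20, mean = 130, sd = 15), rnorm(30, mean = 127, sd = 15)),
  pepticulcer = factor(rep(c("Yes", "No"), times = c(20, 30)))
)

# F test comparing the variances of the two groups
var_check <- var.test(sbp ~ pepticulcer, data = toy)

# Welch's t-test (var.equal = FALSE is the default) does not assume equal variances
welch <- t.test(sbp ~ pepticulcer, data = toy)

# Pooled-variance t-test, the form used in this activity
pooled <- t.test(sbp ~ pepticulcer, data = toy, var.equal = TRUE)

print(var_check$p.value)
print(c(welch = welch$p.value, pooled = pooled$p.value))
```

If the two p-values lead to the same decision, the equal-variance assumption is unlikely to be driving the conclusion. If they disagree, Welch's version is usually the safer one to report.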

Figure 1: boxplot of income (base R boxplot())

Figure 2: boxplot of income (ggplot2)

Findings

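The p-value is 0.203. As this is greater than 0.05, there is not enough evidence to reject H0 at the 5% significance level, so the data do not show an association between SBP and peptic ulcer. The 95% confidence interval for the difference in group means (-2.42 to 11.33) includes 0, which is consistent with this conclusion.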

