Python: Statistics #2 - Big data with Python

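The best way to learn something new is to work through a practical example. The following fragment uses the mean function from Python's statistics module, together with shuffle from the random module, to run a permutation (label-reshuffling) test on the difference between the mean of a drug group and the mean of a placebo group: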

python command

from statistics import mean
from random import shuffle

drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]
observed_diff = mean(drug) - mean(placebo)

n = 10000   # number of random relabelings
count = 0   # relabelings whose difference is at least as large as the observed one
combined = drug + placebo
for i in range(n):
    # Shuffle the pooled values, then split them into groups of the original sizes.
    shuffle(combined)
    new_diff = mean(combined[:len(drug)]) - mean(combined[len(drug):])
    count += (new_diff >= observed_diff)

print(f'{n} label reshufflings produced only {count} instances with a difference')
print(f'at least as extreme as the observed difference of {observed_diff:.1f}.')
print(f'The one-sided p-value of {count / n:.4f} leads us to reject the null')
print('hypothesis that there is no difference between the drug and the placebo.')

output (the count and the p-value vary slightly from run to run because the shuffles are random)

10000 label reshufflings produced only 10 instances with a difference
at least as extreme as the observed difference of 13.0.
The one-sided p-value of 0.0010 leads us to reject the null
hypothesis that there is no difference between the drug and the placebo.    
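
Because the shuffles are random, two runs of the fragment above rarely print exactly the same count. If you want repeatable numbers, one option is to wrap the same loop in a function that uses its own seeded generator. The sketch below is only an illustration: the function name permutation_p_value, its parameters, and the seed value 12345 are made up for this example and are not part of the original code. It also uses fmean, a faster floating-point mean that the statistics module offers from Python 3.8 on.

```python
from random import Random
from statistics import fmean


def permutation_p_value(a, b, n=10_000, seed=None):
    """One-sided permutation test for mean(a) > mean(b) (hypothetical helper)."""
    rng = Random(seed)  # private generator, so a fixed seed gives repeatable runs
    observed = fmean(a) - fmean(b)
    combined = list(a) + list(b)  # copy, so the caller's lists are not shuffled
    count = 0
    for _ in range(n):
        rng.shuffle(combined)
        # Split the shuffled pool into two groups of the original sizes.
        diff = fmean(combined[:len(a)]) - fmean(combined[len(a):])
        count += diff >= observed
    return observed, count / n


drug = [54, 73, 53, 70, 73, 68, 52, 65, 65]
placebo = [54, 51, 58, 44, 55, 52, 42, 47, 58, 46]

# The seed is arbitrary; any fixed integer makes the result reproducible.
observed, p_value = permutation_p_value(drug, placebo, n=10_000, seed=12345)
print(f'observed difference: {observed:.1f}')
print(f'one-sided p-value: {p_value:.4f}')
```

Calling the function twice with the same seed returns the same p-value, while seed=None falls back to a differently seeded generator on every call, which matches the behaviour of the original fragment.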

