Data sets
Last updated on 2025-04-15
BETACAR
Data from a study of the bioavailability of four different preparations of beta-carotene:
- Solaten (30 mg capsules)
- Roche (60 mg capsules)
- Badische Anilin und Soda Fabrik (BASF) (30 mg capsules)
- BASF (60 mg capsules)
23 volunteers had their beta-carotene levels measured in two consecutive-day fasting blood samples. They were then randomized to one of the four preparations, and took 1 pill every other day for 12 weeks. Blood samples were drawn after 6, 8, 10 and 12 weeks.
Dimensions: Rows: 23 Columns: 8
Variable | Description/Code | Unit |
---|---|---|
Prepar | Preparation 1=SOL 2=ROCHE 3=BASF-30 4=BASF-60 | |
Id | Subject # | |
Base1lvl | 1st Baseline Level | µmol/L |
Base2lvl | 2nd Baseline Level | µmol/L |
Wk6lvl | Week 6 Level | µmol/L |
Wk8lvl | Week 8 Level | µmol/L |
Wk10lvl | Week 10 Level | µmol/L |
Wk12lvl | Week 12 Level | µmol/L |
BIGI
The Basic Index of Gender Inequality (BIGI), introduced by Stoet and Geary in 2019, provides a symmetric index of gender inequality. For each of three equality indicators it calculates the ratio between the values for women and men, and returns the average of the three.
The ratio is calculated and centered. E.g. if women in a given country have an expected healthy life span of 71 years, and men have an expected healthy life span of 68 years, the ratio is 71/68 = 1.044118. A value of 1 indicates total equality.
This is done for life satisfaction (based on data from Gallup), healthy life span and education (data for the latter two from Global Gender Gap). Education is calculated as the worst ratio of the values for literacy and enrollment in primary and secondary education.
The three ratios are normalised to set parity equal to 0. A negative score indicates that women are worse off, a positive score that men are worse off. The overall BIGI score is calculated as the average of the three indicators. A final indicator, AADP, calculates the average of the absolute values, i.e. it gives an indication of inequality (0 is still parity) regardless of which sex is disadvantaged.
All data is averaged over the period 2012 to 2016.
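The calculation described above can be sketched in R. This is only an illustration: the education and life satisfaction ratios are made up (only the 71/68 healthy life span ratio comes from the text), and it assumes that "normalised" simply means subtracting 1 from the women/men ratio; the published index may use a different transformation.

```r
# Hypothetical women/men ratios for one country; only the healthy life span
# ratio (71/68) is taken from the example in the text.
ratios <- c(
  basic_education   = 0.98,
  healthy_life_span = 71 / 68,
  life_satisfaction = 1.01
)

# Assumption: normalising means centring the ratio so that parity is 0.
# Negative: women worse off; positive: men worse off.
indicators <- ratios - 1

# BIGI: average of the three normalised indicators.
bigi <- mean(indicators)

# AADP: average of the absolute values, ignoring the direction of the gap.
aadp <- mean(abs(indicators))

round(c(BIGI = bigi, AADP = aadp), 4)
```

Because indicators with opposite signs cancel in BIGI but not in AADP, AADP is never smaller than the absolute value of BIGI.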
Dimensions: Rows: 134 Columns: 9
Source10
Variable | Description/Code | Unit |
---|---|---|
rank | Ranking by AADP | |
overall_rank | Ranking by absolute value of BIGI | |
country | ||
BIGI | BIGI score as described | |
AADP | AADP score as described | |
basic_education | Worst ratio of educational indicators, normalised | |
healthy_life_span | Ratio of healthy life span, normalised | |
life_satisfaction | Ratio of life satisfaction, normalised | |
human_development_index | HDI for country |
BLOOD
A case-control study of risk factors (hormone levels in blood samples) for breast cancer. Individuals are matched on age and current postmenopausal hormone use (PMH).
Each row contains a unique ID, representing both cases (have breast cancer) and controls (do not have breast cancer). The column matchid matches the controls with their respective cases. Cases have identical ID and matchid.
Note the different ways of coding missing values.
Useful for logistic regression to assess the association between testosterone and breast cancer, controlling for age and current PMH use, with testosterone either as a continuous variable or as a categorical variable in quartiles, with the first quartile as the reference group.
Dimensions: Rows: 510 Columns: 9
Variable | Description | Code | Unit |
---|---|---|---|
Id | ID | | |
matchid | Matched ID | | |
case | Case | 1 = case 0 = control | |
curpmh | Current PMH use | 1 = yes 0 = no | |
ageblood | Age at blood draw | | years |
estradol | Estradiol | | pg/mL |
estrone | Estrone | missing = 999 | pg/mL |
testost | Testosterone | missing = 999 | ng/dL |
prolactn | Prolactin | missing = 99.99 | ng/L |
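A minimal sketch of how the missing-value codes and the quartile coding might be handled in base R. The data frame below is a made-up extract that only borrows the column names and missing codes from the table above; it is not the real data.

```r
# Small made-up extract with the same column names as BLOOD.
blood <- data.frame(
  case     = c(1, 0, 1, 0, 1, 0, 1, 0),
  estrone  = c(31, 999, 25, 40, 28, 33, 999, 22),
  testost  = c(25, 21, 999, 30, 18, 27, 35, 15),
  prolactn = c(11.2, 99.99, 8.4, 7.9, 10.1, 9.3, 12.6, 6.5)
)

# Each hormone uses its own missing-value code, so recode them one by one.
blood$estrone[blood$estrone == 999]     <- NA
blood$testost[blood$testost == 999]     <- NA
blood$prolactn[blood$prolactn == 99.99] <- NA

# Testosterone in quartiles; the first level (Q1) becomes the reference
# group when the factor is used in a model.
breaks <- quantile(blood$testost, probs = seq(0, 1, 0.25), na.rm = TRUE)
blood$testost_q <- cut(blood$testost, breaks = breaks,
                       include.lowest = TRUE,
                       labels = c("Q1", "Q2", "Q3", "Q4"))

table(blood$testost_q, useNA = "ifany")
```

With the full data set, the recoded columns could then go into a model along the lines of `glm(case ~ testost_q + ageblood + curpmh, family = binomial, data = blood)`.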
BONEDEN
Data from a twin study on the relationship between bone density and cigarette consumption source2. 41 pairs of middle-aged Australian female twins with different smoking histories had their bone density measured at a hospital in Victoria, along with other factors; details are in the metadata below.
Dimensions: Rows: 41 Columns: 25
The data set is rather wide, so the columns are split up in this description.
Variable | Code | Unit |
---|---|---|
ID | ||
Age | Age | years |
zyg | Mono- or dizygotic twins 1 = mz 2 = dz |
The following variables are duplicated, in the form of xx1 for the lighter-smoking twin and xx2 for the heavier-smoking twin:
Variable | Code | unit |
---|---|---|
ht | Height | cm |
wt | Weight | kg |
tea | Tea | cups/week |
cof | Coffee | cups/week |
alc | Alcohol | drinks/week |
cur | Current Smoking | (cigarettes/day) |
men | Menopause Status 0: Premenopausal 1: Postmenopausal 2: unknown or hysterectomy | |
pyr | Pack-years smoking | year |
ls | Lumbar spine | g/cm² |
fn | Femoral neck | g/cm² |
fs | Femoral shaft | g/cm² |
Pack-years are defined as the number of years the woman has smoked multiplied by the number of packs of cigarettes smoked per day, with a pack normally containing ca. 20 cigarettes.
Lumbar spine: L1-L5.
Femoral neck: Collum femoris
Femoral shaft: Corpus femoris
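As a small illustration of the pack-year definition and of the xx1/xx2 column layout, the sketch below uses a hypothetical helper function `pack_years()` and made-up bone density values for three twin pairs; only the column naming scheme is taken from the description above.

```r
# Pack-years as defined above: packs per day multiplied by years smoked,
# assuming 20 cigarettes per pack.
pack_years <- function(cigs_per_day, years, cigs_per_pack = 20) {
  (cigs_per_day / cigs_per_pack) * years
}

# 30 cigarettes/day is 1.5 packs/day; over 10 years that is 15 pack-years.
pack_years(cigs_per_day = 30, years = 10)

# Made-up values for three twin pairs, using the xx1/xx2 naming scheme:
# ls1 = lumbar spine density of the lighter-smoking twin, ls2 = heavier.
twins <- data.frame(
  ID  = 1:3,
  ls1 = c(0.80, 0.75, 0.92),
  ls2 = c(0.76, 0.77, 0.85)
)

# Within-pair difference (heavier minus lighter smoker), in g/cm^2.
twins$ls_diff <- twins$ls2 - twins$ls1
twins
```

Working with within-pair differences keeps the twin matching intact, which is the point of the paired design.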
BOTOX
69 patients with piriformis syndrome participated in a randomized double-blind clinical trial.
Piriformis syndrome is a condition involving malfunction of the piriformis muscle, a deep buttock muscle, which often causes buttock and lumbar pain with sciatica.
Patients were injected with one of three substances directly in the piriformis muscle:
- a combination of triamcinolone and lidocaine (TL)
- a placebo
- Botox
The randomization was set up with approximately ½ assigned to group 1, 1/6 to group 2 and 1/3 to group 3.
Patients were asked to return 2 weeks after the injection (month 0.5), and monthly thereafter. At each visit patients rated their percentage of improvement of pain compared to baseline on a visual analog scale with a range of -100 to 100%, negative numbers indicating worsening. One patient had the condition in both legs, leading to 70 observations in the dataset.
Dimensions: Rows: 70 Columns: 24
Variable | Code | unit |
---|---|---|
ID | ||
group | 1 = TL 2 = Placebo 3 = Botox | |
pain0 | pain score month 0 | % |
pain05 | pain score month 0.5 | % |
pain1 | pain score month 1 | % |
pain2 | pain score month 2 | % |
pain3 | pain score month 3 | % |
pain4 | pain score month 4 | % |
pain5 | pain score month 5 | % |
pain6 | pain score month 6 | % |
pain7 | pain score month 7 | % |
pain8 | pain score month 8 | % |
pain9 | pain score month 9 | % |
pain10 | pain score month 10 | % |
pain11 | pain score month 11 | % |
pain12 | pain score month 12 | % |
pain13 | pain score month 13 | % |
pain14 | pain score month 14 | % |
pain15 | pain score month 15 | % |
pain16 | pain score month 16 | % |
pain17 | pain score month 17 | % |
For all pain scores 999 indicates missing value.
BREAST
A dataset describing 1200 women in the NHS. In 1990 it was confirmed that they were postmenopausal and free of any cancer. The selection was done so that 200 of the women were using postmenopausal hormones (PMH) in 1990, and 1000 had never used PMH. The objective was to relate breast cancer incidence from 1990 to 2000 to PMH use in 1990.
Data on PMH use are found in three variables: pmh registers PMH use in 1990, while dur3 and dur4 register the length of use of two different types of PMH. The variable foluptm records the time in months between the first questionnaire (in 1990) and a follow-up date. For a control, the follow-up date was the date of the last questionnaire in 2000; for a case, it was the date of diagnosis of breast cancer. Other cancer risk factors are recorded as well.
Dimensions: Rows: 1200 Columns: 18
variable | Description | unit |
---|---|---|
Id | ID | |
case | case 1 = case 0 = control | |
age | age | years |
agemenar | age at menarche | years |
agemenop | age at menopause | years |
afb | age at first birth 98=nullip | years |
parity | parity | |
bbd | Benign Breast disease 1 = yes 0 = no | |
famhx | family history breast cancer 1 = yes 0 = no | |
bmi | BMI | kg/m² |
hgt | Height | inches |
alcohol | Alcohol use | g/day |
pmh | PMH status 2 = never user 3 = current user | |
dur3 | Duration of Estrogen use (months) | months |
dur4 | Duration of Estrogen + progesterone use (months) | months |
csmk | Current Smoker 1 = yes 0 = no | |
psmk | Past smoker 1 = yes 0 = no | |
foluptm | Months of follow up Note: Some subjects provided no follow up after the 1990 questionnaire: foluptm=0 for these people | months |
CORNEAL
Data from a randomized trial of two different active drugs of the fluoroquinolone-group, M and G along with a placebo, P. 93 subjects were placed in one of three groups:
Group | Eye 1 | Eye 2 |
---|---|---|
A | G | P |
B | M | P |
C | G | M |
Each subject was asked to administer the two assigned preparations four times per day for 10 days. The response was measured at baseline (without treatment), on visit 1, and again at visit 2 after 7 days, and at visit 3, on day 14. Note that at day 14, the subjects had stopped administering the preparations. The response was measured as corneal sensitivity in five regions of the eyes: central, superior, inferior, temporal and nasal. Sensitivity was measured in mm, with a range of 40-60 mm. High values indicate greater (normal) sensitivity, low values lower (abnormal) sensitivity.
Dimensions: Rows: 186 Columns: 17
Variable | Variable label | unit |
---|---|---|
id | ID | |
tr | Treatment 1 = M 2 = G 3 = P | |
c1 | Central visit 1 | mm |
s1 | Superior visit 1 | mm |
i1 | Inferior Visit 1 | mm |
t1 | Temporal visit 1 | mm |
n1 | Nasal Visit 1 | mm |
c2 | Central Visit 2(day 7) | mm |
s2 | Superior Visit 2 | mm |
i2 | Inferior Visit 2 | mm |
t2 | Temporal Visit 2 | mm |
n2 | Nasal Visit 2 | mm |
c3 | Central Visit 3(day 14) | mm |
s3 | Superior Visit 3 | mm |
i3 | Inferior Visit 3 | mm |
t3 | Temporal Visit 3 | mm |
n3 | Nasal Visit 3 | mm |
99: Missing values
DIABETES
A study on whether maintaining diabetes control in type I diabetes affects growth and development in childhood. 94 boys aged 9-15 were examined approx. every 3 months, leading to 910 visits. At each visit the degree of diabetes control was assessed by measuring HgbA1c (glycosylated hemoglobin). The exact dates of the visits are registered in the mon_a1c, day_a1c and yr_a1c variables; age, height and weight were registered as well as the HgbA1c measurement.
The HgbA1c measurement is the percentage of hemoglobin that is glycosylated. The normal range is between 4.0 and 5.6%, with values larger than 6.5% being indicative of diabetes. The normal range does not appear to change by pubertal stage.
Dimensions: Rows: 910 Columns: 8
Variable | Code | unit |
---|---|---|
ID | ||
mon_a1c | Month | |
day_a1c | Day A1c | |
yr_a1c | Yr | |
age_yrs | Age in years | year |
gly_a1c | Hemoglobin A1c | % |
ht_cm | Height missing=999.9 | cm |
wt_kg | Weight | kg |
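Since the visit date is spread over three columns and height uses a numeric missing code, a first cleaning step might look like the sketch below. The rows are made up, and the sketch assumes that yr_a1c holds four-digit years; if the real file stores two-digit years, the century would have to be added first.

```r
# Made-up visits with the same column names as DIABETES.
# Assumption: yr_a1c is stored as a four-digit year.
diabetes <- data.frame(
  ID      = c(1, 1, 2),
  mon_a1c = c(5, 8, 11),
  day_a1c = c(23, 20, 2),
  yr_a1c  = c(1990, 1990, 1991),
  gly_a1c = c(13.7, 11.1, 9.8),
  ht_cm   = c(149.0, 999.9, 156.2)
)

# Combine year, month and day into a single Date column.
diabetes$visit_date <- as.Date(sprintf("%d-%02d-%02d",
                                       diabetes$yr_a1c,
                                       diabetes$mon_a1c,
                                       diabetes$day_a1c))

# Recode the missing-value code for height to NA.
diabetes$ht_cm[diabetes$ht_cm == 999.9] <- NA

diabetes[, c("ID", "visit_date", "gly_a1c", "ht_cm")]
```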
EAR
Data on 214 children with acute otitis media in one or both ears. They were randomly assigned 14 days of antibiotic treatment with either CEF (cefaclor) or AMO (amoxicillin). The status of their ear infection at a follow-up visit after 14 days was recorded.
Dimensions: Rows: 278 Columns: 5
Additional reference
Variable | Description |
---|---|
Id | ID |
Clear | Clearance by 14 days 1 = yes 0 = no |
Antibo | Antibiotic 1 = CEF 2 = AMO |
Age | Age 1 = <2 yrs 2 = 2-5 yrs 3 = 6+ yrs |
Ear | Ear 1 = 1st ear 2 = 2nd ear |
EFF
Data from a literature search on the efficacy (that is: does it work?) of a number of aminoglycosides for the treatment of infectious diseases.
Note that this dataset is basically the same as the dataset NEPHRO, where the nephrotoxicity of the preparations described in the same papers is reported.
We get the sample size of patients in different studies, and the number of patients that were cured of their infection. Which antibiotic is best?
Dimensions: Rows: 64 Columns: 6
Variable | Description/Code |
---|---|
Name | Study name |
Id | Study Number |
Endpnt | Endpoint 1=efficacy |
Antibio | Antibiotic 1 = Amikacin 2 = Gentamicin 3 = Netilmicin 4 = Sisomycin 5 = Tobramycin |
Samp_sz | Sample Size |
Cured | Number Cured |
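One simple way to approach the question "which antibiotic is best?" is to pool the studies per antibiotic and compare cure proportions. The sketch below uses made-up study counts with the column names from the table above; naive pooling ignores differences between studies, so it is a starting point rather than a proper meta-analysis.

```r
# Made-up study results with the same column names as EFF.
eff <- data.frame(
  Antibio = c(1, 1, 2, 2, 5),
  Samp_sz = c(40, 60, 50, 30, 45),
  Cured   = c(30, 50, 35, 24, 36)
)

# Sum sample sizes and number cured across studies for each antibiotic.
pooled <- aggregate(cbind(Samp_sz, Cured) ~ Antibio, data = eff, FUN = sum)

# Pooled cure proportion per antibiotic.
pooled$cure_rate <- pooled$Cured / pooled$Samp_sz

# Replace the numeric codes with the names from the code list.
pooled$Antibio <- factor(pooled$Antibio, levels = 1:5,
                         labels = c("Amikacin", "Gentamicin", "Netilmicin",
                                    "Sisomycin", "Tobramycin"))
pooled
```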
ENDOCRIN
Data for comparison of measurements of four hormones in five subjects. Measurements were done twice on blood samples from each subject. Are the measurements reproducible?
Dimensions: Rows: 10 Columns: 6
Variable | LABEL | unit |
---|---|---|
Subject | SUBJECT # | |
Replicat | REPLICATE # | |
Estrone | ESTRONE | pg/mL |
Estradol | ESTRADIOL | pg/ml |
Androste | ANDROSTENEDIONE | ng/dL |
Testost | TESTOSTERONE | ng/dL |
ESTRADL
Data on 211 women. Measurement of estradiol, ethnicity, number of children, BMI and Waist/hip ratio.
Dimensions: Rows: 211 Columns: 10
Variable | Code | unit |
---|---|---|
Id | Identification number | |
Estradl | Estradiol | pg/ml |
Ethnic | Ethnicity 0 = African-American 1 = Caucasian | |
Entage | Age | year |
Numchild | Parity, number of children 9=missing | |
Agefbo | Age at 1st birth (= 0 if numchild = 0) 99 = missing | year |
Anykids | Any children 1 = yes 0 = no 9 = missing | |
Agemenar | Age at menarche 99=missing | years |
BMI | Body Mass Index | kg/m² |
WHR | waist-hip ratio |
CAVE: Note the coding of Agefbo
ESTROGEN
The influence of different doses of estrogen on systolic and diastolic blood pressure is investigated. 31 subjects are placed into one of three study types, and receive treatment twice for four weeks, with a two-week washout period in between.
Are there any significant differences between the different treatments - or carry over effects on blood pressure in the different groups?
Also an example of untidy data with values in column names.
Dimensions: Rows: 62 Columns: 22
Variable | LABEL | unit |
---|---|---|
Id | ID | |
std_typ | STUDY TYPE 1 = 0.625MG VS PLACEBO 2 = 1.25MG VS PLACEBO 3 = 1.25MG VS 0.625MG | |
period | PERIOD | |
trtgrp | TREATMENT 1 = PLACEBO 2 = 0.625MG 3 = 1.25MG | |
sysd1r1 | SYSTOLIC BP DAY 1 READING 1 | mmHg |
diasd1r1 | DIASTOLIC BP DAY 1 READING 1 | mmHg |
sysd1r2 | SYSTOLIC BP DAY 1 READING 2 | mmHg |
diasd1r2 | DIASTOLIC BP DAY 1 READING 2 | mmHg |
sysd1r3 | SYSTOLIC BP DAY 1 READING 3 | mmHg |
diasd1r3 | DIASTOLIC BP DAY 1 READING 3 | mmHg |
sysd2r1 | SYSTOLIC BP DAY 2 READING 1 | mmHg |
diasd2r1 | DIASTOLIC BP DAY 2 READING 1 | mmHg |
sysd2r2 | SYSTOLIC BP DAY 2 READING 2 | mmHg |
diasd2r2 | DIASTOLIC BP DAY 2 READING 2 | mmHg |
sysd2r3 | SYSTOLIC BP DAY 2 READING 3 | mmHg |
diasd2r3 | DIASTOLIC BP DAY 2 READING 3 | mmHg |
sysd3r1 | SYSTOLIC BP DAY 3 READING 1 | mmHg |
diasd3r1 | DIASTOLIC BP DAY 3 READING 1 | mmHg |
sysd3r2 | SYSTOLIC BP DAY 3 READING 2 | mmHg |
diasd3r2 | DIASTOLIC BP DAY 3 READING 2 | mmHg |
sysd3r3 | SYSTOLIC BP DAY 3 READING 3 | mmHg |
diasd3r3 | DIASTOLIC BP DAY 3 READING 3 | mmHg |
999: Missing blood pressure data.
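The column names encode the measure (sys/dias), the day and the reading. A base R sketch of how these values could be moved out of the column names into their own columns is shown below; the two rows are made up and only a few of the blood pressure columns are included.

```r
# Two made-up rows with a few of the blood pressure columns from ESTROGEN.
wide <- data.frame(
  Id = c(1, 2), period = c(1, 1),
  sysd1r1 = c(120, 130), diasd1r1 = c(80, 85),
  sysd1r2 = c(118, 128), diasd1r2 = c(999, 84),
  sysd2r1 = c(122, 131), diasd2r1 = c(78, 86)
)

bp_cols <- grep("^(sys|dias)d[0-9]r[0-9]$", names(wide), value = TRUE)

# stack() puts all measurements in one column ("values") and the original
# column name in another ("ind"), column by column; the id columns are
# repeated in the same order.
long <- cbind(
  wide[rep(seq_len(nrow(wide)), times = length(bp_cols)), c("Id", "period")],
  stack(wide[bp_cols])
)
rownames(long) <- NULL

# Split the old column name into measure, day and reading.
long$ind     <- as.character(long$ind)
long$measure <- sub("d[0-9]r[0-9]$", "", long$ind)
long$day     <- as.integer(sub("^.*d([0-9])r[0-9]$", "\\1", long$ind))
long$reading <- as.integer(sub("^.*r([0-9])$", "\\1", long$ind))

# Recode the missing-value code and give the value column a clearer name.
long$values[long$values == 999] <- NA
names(long)[names(long) == "values"] <- "bp_mmHg"
long
```

The same reshaping can be done with tidyr's `pivot_longer()`; base R is used here only to keep the sketch free of package dependencies.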
FEV
Data on 654 children seen in the Childhood Respiratory Disease Study6 in East Boston, Massachusetts in 1980. Forced Expiratory Volume (FEV), an index of pulmonary function, was measured, along with age, height, sex and smoking status.
FEV is the volume of air (in liters) that can be expelled from the lungs in one second.
Useful for demonstrating linear regression, including multiple linear regression with categorical interactions, and change in FEV as a function of height and sex.
Dimensions: Rows: 654 Columns: 6
Variable | Description | Unit |
---|---|---|
Id | ID number | |
Age | Age | years |
FEV | FEV | l |
Hgt | Height | in |
Sex | Sex 0 = female 1 = male | |
Smoke | Smoking Status 0 = non-current smoker 1 = current smoker |
FIELD
Data from a study of the ocular disease retinitis pigmentosa (RP). The condition can result in substantial loss of vision, in some cases complete blindness. It has been discovered that this disease is linked to two genes. Mutations in the rhodopsin gene RHO account for many cases that are predominantly inherited, whereas mutations in the RPGR gene account for many sex-linked cases; only males can have the RPGR mutation. The study measures the visual field of approx. 100 patients in each group (RHO and RPGR). The field of vision is measured in square degrees (°²).
Are there differences in the baseline level of visual field between the two groups? Does the rate of decline differ between the two groups?
Dimensions: Rows: 1326 Columns: 8
Variable | Description | unit |
---|---|---|
id | ID | |
group | group 1 = RHO 2 = RPGR | |
age | age at visit | years |
gender | gender 1 = m 2 = f | |
dtvisit | date of visit (month/day/year) | |
folowup | time from 1st visit | years |
totfldod | total field area right eye (OD) | °² |
totfldos | total field area left eye (OS) | °² |
Note: all RPGR individuals have to be male
HEART
A dataset showing the prevalence of different heart conditions, in different populations, and of different symptoms.
Dimensions: Rows: 7 Columns: 9
Variable | code |
---|---|
Diagnosis | Y1 = normal Y2 = atrial septal defect without pulmonary stenosis or pulmonary hypertension Y3 = ventricular septal defect with valvular pulmonary stenosis Y4 = isolated pulmonary hypertension Y5 = transposed great vessels Y6 = ventricular septal defect without pulmonary hypertension Y7 = ventricular septal defect with pulmonary hypertension |
Prevalence | Prevalence |
X1 | age 1-20 years old |
X2 | age>20 years old |
X3 | mild cyanosis |
X4 | easy fatigue |
X5 | chest pain |
X6 | repeated respiratory infections |
X7 | EKG axis more than 110 |
HORMONE
Data on the influence of four hormones, compared with a saline solution, on the pancreatic and biliary secretions in laying hens. White leghorn hens, aged 14-29 weeks, were fitted with cannulas for collection of pancreatic and biliary secretions, and a jugular cannula for infusion of the hormones. One trial per day was performed until the jugular cannula stopped working; therefore there is a different number of observations per hen.
Dimensions: Rows: 398 Columns: 11
Variable | Description/Code | unit |
---|---|---|
ID | ID | |
Bilsecpr | Biliary secretion-pre | µl/min |
Bilphpr | Biliary pH-pre | pH |
Pansecpr | Pancreatic secretion-pre | µl/min |
Panphpr | Pancreatic pH-pre | pH |
Dose | Dose | APP: ng/mL plasma CCK, VIP, SEC: µg/kg/h |
Bilsecpt | Biliary secretion-post | µl/min |
Bilphpt | Biliary pH-post | pH |
Pansecpt | Pancreatic secretion-post | µl/min |
Panphpt | Pancreatic pH-post | pH |
Hormone | Hormone 1 = SAL 2 = APP 3 = CCK 4=SEC 5=VIP |
A value of 0 for pH indicates a missing value.
- SAL: Saline
- APP: Avian pancreatic polypeptide
- CCK: Cholecystokinin
- SEC: Secretin
- VIP: Vasoactive intestinal peptide
HOSPITAL
The dataset is part of a larger set, collected on patients discharged from a hospital in Pennsylvania, as part of a study on use of antibiotic in hospitals.
Is the length of hospitalization affected by whether a patient received antibiotics?
Dimensions: Rows: 25 Columns: 9
Variable | Label | unit |
---|---|---|
Id | id no. | |
Dur_stay | Duration of hospital stay | |
Age | Age | |
Sex | Sex 1 = male 2 = female | |
Temp | First temperature following admission | |
WBC | First WBC(x1000) following admission | |
Antibio | Received antibiotic 1 = yes 2 = no | |
Bact_cul | Received bacterial culture 1 = yes 2 = no | |
Service | Service 1 = med 2 = surg. |
WBC: White Bloodcell Count, an indicator of infection.
INFANTBP
Dimensions: Rows: 100 Columns: 18
Salt Taste Variables
Variable | Description | unit |
---|---|---|
ID | ||
Mn_sbp | Mean SBP 99.99=missing | |
Mn_dbp | Mean DBP 99.99=missing | |
MSB1slt | MSB-trial 1* water | |
MSB2slt | MSB-trial 2 water | |
MSB3slt | MSB-trial 3 0.1 molar salt + water | |
MSB4slt | MSB-trial 4 0.1 molar salt + water | |
MSB5slt | MSB-trial 5 water | |
MSB6slt | MSB-trial 6 water | |
MSB7slt | MSB-trial 7 0.3 molar salt + water | |
MSB8slt | MSB-trial 8 0.3 molar salt + water | |
MSB9slt | MSB-trial 9 water | |
MSB10slt | MSB-trial 10 water |
Sugar Taste Variables
Variable | Description |
---|---|
MSB1sug | MSB-trial 1 non-nutritive sucking |
MSB2sug | MSB-trial 2 water |
MSB3sug | MSB-trial 3 5% sucrose + water |
MSB4sug | MSB-trial 4 15% sucrose + water |
MSB5sug | MSB-trial 5 non-nutritive sucking |
* For MSB data, 999.99 is a missing value; 0 indicates the baby did not suck.
LEAD
Dimensions: Rows: 124 Columns: 40
VARIABLE | DESCRIPTION |
---|---|
id | IDENTIFICATION NUMBER |
area | AREA - RESIDENCE ON AUG’72 1 = 0-1 MILES FROM SMELTER 2 = 1-2.5 MILES 3 = 2.5-4.1 MILES |
ageyrs | AGE in years |
sex | SEX 1 = MALE 2 = FEMALE |
IQ TEST RESULTS
VARIABLE | DESCRIPTION |
---|---|
iqv_inf | INF - INFORMATION SUBTEST IN WISC AND WPPSI |
iqv_comp | COMP - COMPREHENSION SUBTEST IN WISC AND WPPSI |
iqv_ar | AR - ARITHMETIC SUBTEST IN WISC AND WPPSI |
iqv_ds | DS - DIGIT SPAN SUBTEST(WISC) AND SENTENCE COMPLETION(WPPSI) |
iqv_raw | V/RAW - RAW SCORE/VERBAL IQ |
iqp_pc | PC - PICTURE COMPLETION SUBTEST IN WISC AND WPPSI |
iqp_bd | BD - BLOCK DESIGN SUBTEST IN WISC AND WPPSI |
iqp_oa | OA - OBJECT ASSEMBLY SUBTEST(WISC), ANIMAL HOUSE SUBTEST(WPPSI) |
iqp_cod | COD - CODING SUBTEST(WISC), GEOMETRIC DESIGN SUBTEST(WPPSI) |
iqp_raw | P/RAW - RAW SCORE/PERFORMANCE IQ (TOTAL OF SCORES PC, BD, OA, & COD) |
hh_index | HH/INDEX - HOLLINGSHEAD INDEX OF SOCIAL STATUS |
iqv | IQV - VERBAL IQ |
iqp | IQP - PERFORMANCE IQ |
iqf | IQF - FULL SCALE IQ (NOT SUM OR AVERAGE OF IQV AND IQP) |
iq_type | TYPE OF IQ TEST 1 = WISC 2 = WPPSI (WISC USUALLY GIVEN TO CHILDREN GE 5 YRS 1 MONTH OF AGE WPPSI USUALLY GIVEN TO CHILDREN LE 5YRS OF AGE) |
lead_grp | GROUP - BLOOD LEAD LEVEL GROUP 1 = BLOOD LEAD LEVELS BELOW 40 MICROGRAMS/100ML IN BOTH 1972 & 1973 (control group) 2 = BLOOD LEAD LEVELS GREATER THAN OR EQUAL TO 40 MICROGRAMS/100ML IN BOTH 72 & 73 OR A LEVEL GREATER THAN OR EQUAL TO 40 IN 73 ALONE (3 CASES ONLY) (currently exposed group) 3 = BLOOD LEAD LEVELS GREATER THAN OR EQUAL TO 40 MICROGRAMS/100ML IN 72 AND LESS THAN 40 IN 73 (previously exposed group) |
Group | 1=control group; 2=exposed group |
ld72 | LD72 - BLOOD LEAD VALUES (MICROGRAMS/100ML) IN 72 MISSING=99 |
ld73 | LD73 - BLOOD LEAD VALUES (MICROGRAMS/100ML) IN 73 |
fst2yrs | FST2YRS - DID CHILD LIVE FOR 1ST 2 YRS WITHIN 1 MILE OF SMELTER 1=YES 2=NO |
totyrs | TOTYRS - TOTAL NUMBER OF YEARS SPENT WITHIN 4.1 MILES OF SMELTER |
SYMPTOM DATA (AS REPORTED BY PARENTS)
VARIABLE | DESCRIPTION |
---|---|
pica | PICA 1 = YES 2 = NO |
colic | COLIC 1 = YES 2 = NO |
clumsi | CLUMSINESS 1 = YES 2 = NO |
irrit | IRRITABILITY 1 = YES 2 = NO |
convul | CONVULSIONS 1 = YES 2 = NO |
NEUROLOGICAL TEST DATA
VARIABLE | DESCRIPTION |
---|---|
_2plat_r | # OF TAPS FOR RIGHT HAND IN THE 2-PLATE TAPPING TEST (# TAPS IN ONE 10 SECOND TRIAL) MISSING=99 |
_2plat_l | # OF TAPS FOR LEFT HAND IN THE 2-PLATE TAPPING TEST (# TAPS IN ONE 10 SECOND TRIAL) MISSING=99 |
visrea_r | VISUAL REACTION TIME RIGHT HAND (MILLISECONDS) MISSING=99 |
visrea_l | VISUAL REACTION TIME LEFT HAND (MILLISECONDS) MISSING=99 |
audrea_r | AUDITORY REACTION TIME RIGHT HAND (MILLISECONDS) MISSING=99 |
audrea_l | AUDITORY REACTION TIME LEFT HAND (MILLISECONDS) MISSING=99 |
fwt_r | FINGER-WRIST TAPPING TEST RIGHT HAND (# TAPS IN ONE 10 SECOND TRIAL) MISSING=99 |
fwt_l | FINGER-WRIST TAPPING TEST LEFT HAND (# TAPS IN ONE 10 SECOND TRIAL) MISSING=99 |
hyperact | WWPS - WERRY-WEISS-PETERS SCALE FOR HYPERACTIVITY 0=NO ACTIVITY . . . . 4=SEVERELY HYPERACTIVE (AS REPORTED BY PARENTS) MISSING=99 |
maxfwt | Finger-wrist tapping test in dominant hand (max of fwt_r, fwt_l) |
MICE
Dimensions: Rows: 240 Columns: 6
Variable | Description | unit |
---|---|---|
Id | ID | |
Group | 1 = RP 2 = NORMAL | |
Trtgrp | TREATMENT GROUP A = LIGHT B = DIM C = DARK | |
Age | AGE | days |
B_amp | B AMP | |
A_amp | A AMP |
9999 = missing.
NEPHRO
Data from a literature study5 on nephrotoxicity of several different aminoglycosides.
Note that this dataset is closely related to the datasets EFF and OTO, where EFF reports efficacy of the preparations, and OTO reports a combination of efficacy and side effects.
We get the sample size of patients in different studies, and the number of patients that experienced nephrotoxicity. Which antibiotic is best?
Dimensions: Rows: 72 Columns: 6
Variable | Description/Code |
---|---|
name | Study name |
id | Study number |
Endpnt | Endpoint 2=nephrotoxicity |
Antibio | Antibiotic 1=Amikacin 2=Gentamicin 3=Netilmicin 4=Sisomycin 5=Tobramycin |
Samp_sz | Sample size |
Side_eff | Number with side effects |
NIFED
Dimensions: Rows: 34 Columns: 10
Variable | Description | Code |
---|---|---|
Id | ID | |
trtgrp | Treatment group | N=nifedipine/P=placebo |
bashrtrt | Baseline heart rate* | beats/min |
lv1hrtrt | Level 1 heart rate+ | beats/min |
lv2hrtrt | Level 2 heart rate | beats/min |
lv3hrtrt | Level 3 heart rate | beats/min |
bassys | Baseline systolic bp* | mm Hg |
lv1sys | Level 1 systolic bp | mm Hg |
lv2sys | Level 2 systolic bp | mm Hg |
lv3sys | Level 3 systolic bp | mm Hg |
* Immediately prior to randomization.
+ Highest heart rate and systolic blood pressure at baseline and each level of therapy respectively.
Values of 999 indicate one of the following:
- the patient withdrew from the study prior to entering this level of therapy,
- the patient achieved pain relief prior to reaching this level of therapy,
- the patient encountered this level of therapy, but this particular piece of data was missing.
OTO
Data from a literature study5 on nephro- and ototoxicity and efficacy of several different aminoglycosides.
Note that this dataset is closely related to the datasets EFF and NEPHRO, where EFF reports efficacy of the preparations, and NEPHRO reports on nephrotoxicity.
We get the sample size of patients in different studies, and the number of patients that experienced side effects. Which antibiotic is best?
Dimensions: Rows: 50 Columns: 6
Variable | Description/Code |
---|---|
Name | Study Name |
Id | Study Number |
Endpnt | Endpoint 1 = efficacy 2 = nephrotoxicity 3 = ototoxicity |
Antibio | Antibiotic 1 = Amikacin 2 = Gentamicin 3 = Netilmicin 4 = Sisomycin 5 = Tobramycin |
Samp_sz | Sample Size |
Side_eff | Number with side effect |
PIRIFORM
Dimensions: Rows: 631 Columns: 5
Variable | Code |
---|---|
ID | |
piriform | Piriformis Syndrome 1 = Negative 2 = Positive |
sex | Sex 1 = Male 2 = Female |
Age | |
maxchg | Max change between tibia and peroneal |
SEXRAT
Frequencies of different sex orders of the first 5 children born in families.
Does the probability of a male birth differ from 50%?
Are the sexes of successive offspring independent? I.e., does the sex of the first-born child affect the probability of the second child being male?
Dimensions: Rows: 60 Columns: 8
Variable | code |
---|---|
nm_chld+ | Number of children |
sx_1 | Sex of 1st born |
sx_2 | Sex of 2nd born |
sx_3 | Sex of 3rd born |
sx_4 | Sex of 4th born |
sx_5 | Sex of 5th born |
sexchldn* | Sex of all children |
num_fam** | Number of families |
+ For families with 5+ children, the sexes of the first 5 children are listed. The number of children is given as 5 for such families.
* The sex of successive births is given. Thus, MMMF means that the first three children were males and the fourth child was a female. There were 484 such families.
** Number of families with a specific gender combination of children
Example; there are:
- 4400 families with 2 children where both children are male,
- 4270 families with 2 children where the first child is male, and the second female and,
- 4633 families with 2 children where the first child is female and the second male.
Compare P(child 2 is male | child 1 is female) with P(child 2 is male | child 1 is male)
That is, the probability child 2 is male given that child 1 is female.
R
library(tidyverse)

sexrat <- read_csv("https://raw.githubusercontent.com/KUBDatalab/R-toolbox/main/episodes/data/SEXRAT.csv")

# Number of families with female first child:
sexrat %>%
  filter(sx_1 == "F") %>%
  summarise(nF1 = sum(num_fam))
#> # A tibble: 1 x 1
#>     nF1
#>   <dbl>
#> 1 25719

# Number of those families with a male second child:
sexrat %>%
  filter(sx_1 == "F",
         sx_2 == "M") %>%
  summarise(nF1M2 = sum(num_fam))
#> # A tibble: 1 x 1
#>   nF1M2
#>   <dbl>
#> 1 12882

# Point estimate for probability of child 2 being male, given child 1 is female:
pF1M2 <- 12882/25719
pF1M2
#> [1] 0.5008748

# Standard error of mean for proportions:
SEM_F1M2 <- sqrt(pF1M2*(1-pF1M2)/25719)
SEM_F1M2
#> [1] 0.003117757

# That gives us a 95% confidence interval for P(Child 2 is male | Child 1 is female):
pF1M2 + c(-1,1)*1.96*SEM_F1M2
#> [1] 0.4947640 0.5069856

# Doing the same calculations for P(Child 2 is male | Child 1 is male)
# gives us an interval of (rounded) 0.512 to 0.524,
# which would indicate that having a male child first increases the probability
# of having a second male child.
SMOKE
Dimensions: Rows: 234 Columns: 8
Variable | Code | unit |
---|---|---|
ID | ID number | |
Age | age | |
Gender | Gender 1 = male 2 = female | |
Cig_day | Cigarettes/day | |
CO | Carbon monoxide (CO) (X 10) | |
Min_last | Minutes elapsed since last cigarette | |
LogCOadj | Log CO Adj * (X 1000) | |
Day_abs | Days abstinent Those abstinent less than 1 day were given a value of zero. |
999 and 9999 = missing values
* This variable represents adjusted carbon monoxide (CO) values. CO values were adjusted for minutes elapsed since last cigarette smoked using the formula log10 CO (adjusted) = log10 CO - (-0.000638) × (Min - 80), where Min is the number of minutes elapsed since the last cigarette smoked.
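The adjustment formula can be written as a small R function. The function name `adjust_log_co()` is made up for illustration, and the sketch applies the formula to unscaled values; the scaling noted in the table (CO stored × 10, LogCOadj × 1000) is not applied here.

```r
# Adjusted log10 CO as given by the formula above.
# co:       carbon monoxide value (unscaled)
# min_last: minutes elapsed since the last cigarette
adjust_log_co <- function(co, min_last) {
  log10(co) - (-0.000638) * (min_last - 80)
}

# At exactly 80 minutes the adjustment term is zero,
# so the result equals log10(co).
adjust_log_co(co = 20, min_last = 80)

# More than 80 minutes since the last cigarette adjusts the value upwards.
adjust_log_co(co = 20, min_last = 200)
```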
SWISS
Data from the Swiss Analgesic Study, done to assess the effect on renal function and other health parameters of taking different phenacetin-based analgesics.
In this part of the dataset, we get measurements of serum creatinine for different years.
The study group consisted of 624 women from workplaces near Basel with a high intake of phenacetin-based analgesics. 626 women from the same workplaces, with a low or non-existent intake of these analgesics, were studied as the control group.
The study group was, based on the level of NAPAP (N-acetyl-p-aminophenol) in their blood samples, divided into high and low level subgroups. Both subgroups had higher NAPAP levels than the control group.
A baseline measurement of serum creatinine was taken in 1967-68, and follow-ups were done in 1969-1978.
Dimensions: Rows: 300 Columns: 10
Variable | Codes | unit |
---|---|---|
ID | ID | |
age | age | years |
group | Group 1 = High NAPAP 2 = Low NAPAP 3 = control | |
creat_68 | Serum Creatinine 1968 | (mg/dL) |
creat_69 | Serum Creatinine 1969 | (mg/dL) |
creat_70 | Serum Creatinine 1970 | (mg/dL) |
creat_71 | Serum Creatinine 1971 | (mg/dL) |
creat_72 | Serum Creatinine 1972 | (mg/dL) |
creat_75 | Serum Creatinine 1975 | (mg/dL) |
creat_78 | Serum Creatinine 1978 | (mg/dL) |
For all creat_xx variables, 9.99 indicates missing data, i.e. NA values.
TEAR
Dimensions: Rows: 14 Columns: 61
Variable | Code |
---|---|
ID | |
od3bas1 | OD 3sec baseline 1 |
od3bas2 | OD 3 sec baseline 2 |
od3im1 | OD 3 sec immediately post 1 |
od3im2 | OD 3 sec immediately post 2 |
od3pst51 | OD 3 sec 5min post 1 |
od3pst52 | OD 3 sec 5min post 2 |
od3pt101 | OD 3 sec 10min post 1 |
od3pt102 | OD 3 sec 10min post 2 |
od3pt151 | OD 3 sec 15min post 1 |
od3pt152 | OD 3 sec 15min post 2 |
os3bas1 | OS 3sec baseline 1 |
os3bas2 | OS 3 sec baseline 2 |
os3im1 | OS 3 sec immediately post 1 |
os3im2 | OS 3 sec immediately post 2 |
os3pst51 | OS 3 sec 5min post 1 |
os3pst52 | OS 3 sec 5min post 2 |
os3pt101 | OS 3 sec 10min post 1 |
os3pt102 | OS 3 sec 10min post 2 |
os3pt151 | OS 3 sec 15min post 1 |
os3pt152 | OS 3 sec 15min post 2 |
od6bas1 | OD 6 sec baseline 1 |
od6bas2 | OD 6 sec baseline 2 |
od6im1 | OD 6 sec immediately post 1 |
od6im2 | OD 6 sec immediately post 2 |
od6pst51 | OD 6 sec 5min post 1 |
od6pst52 | OD 6 sec 5min post 2 |
od6pt101 | OD 6 sec 10min post 1 |
od6pt102 | OD 6 sec 10min post 2 |
od6pt151 | OD 6 sec 15min post 1 |
od6pt152 | OD 6 sec 15min post 2 |
os6bas1 | OS 6 sec baseline 1 |
os6bas2 | OS 6 sec baseline 2 |
os6im1 | OS 6 sec immediately post 1 |
os6im2 | OS 6 sec immediately post 2 |
os6pst51 | OS 6 sec 5min post 1 |
os6pst52 | OS 6 sec 5min post 2 |
os6pt101 | OS 6 sec 10min post 1 |
os6pt102 | OS 6 sec 10min post 2 |
os6pt151 | OS 6 sec 15min post 1 |
os6pt152 | OS 6 sec 15min post 2 |
od10bas1 | OD 10 sec baseline 1 |
od10bas2 | OD 10 sec baseline 2 |
od10im1 | OD 10 sec immediately post 1 |
od10im2 | OD 10 sec immediately post 2 |
od10ps51 | OD 10 sec 5min post 1 |
od10ps52 | OD 10 sec 5min post 2 |
od10p101 | OD 10 sec 10min post 1 |
od10p102 | OD 10 sec 10min post 2 |
od10p151 | OD 10 sec 15min post 1 |
od10p152 | OD 10 sec 15min post 2 |
os10bas1 | OS 10 sec baseline 1 |
os10bas2 | OS 10 sec baseline 2 |
os10im1 | OS 10 sec immediately post 1 |
os10im2 | OS 10 sec immediately post 2 |
os10ps51 | OS 10 sec 5min post 1 |
os10ps52 | OS 10 sec 5min post 2 |
os10p101 | OS 10 sec 10min post 1 |
os10p102 | OS 10 sec 10min post 2 |
os10p151 | OS 10 sec 15min post 1 |
os10p152 | OS 10 sec 15min post 2 |
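The 60 measurement columns follow a naming scheme that encodes eye, number of seconds, time point and measurement number, although the time-point abbreviations vary (for example `pst5` and `ps5` both mean 5 min post). The sketch below splits a column name into those parts. The function `parse_tear_name`, the output keys and the interpretation of the final digit as a measurement number are assumptions made for illustration.

```python
import re

# Time-point abbreviations as they appear in the variable names above.
TIMEPOINTS = {
    "bas": "baseline",
    "im": "immediately post",
    "pst5": "5 min post",
    "ps5": "5 min post",
    "pt10": "10 min post",
    "p10": "10 min post",
    "pt15": "15 min post",
    "p15": "15 min post",
}

# eye (od/os), seconds (3/6/10), time point, measurement number (1/2)
NAME_PATTERN = re.compile(
    r"^(od|os)(3|6|10)(" + "|".join(TIMEPOINTS) + r")([12])$"
)


def parse_tear_name(name):
    """Split a TEAR column name into its documented components."""
    match = NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a TEAR measurement column: {name!r}")
    eye, seconds, timepoint, number = match.groups()
    return {
        "eye": eye.upper(),
        "seconds": int(seconds),
        "timepoint": TIMEPOINTS[timepoint],
        "measurement": int(number),
    }


parsed = parse_tear_name("od10p151")
```

Parsing the names this way makes it possible to reshape the 60 columns into one row per eye, duration, time point and measurement.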
TEMPERAT
Dimensions: Rows: 630 Columns: 6
Variable | LABEL | unit |
---|---|---|
Date | DATE (MDY) | |
Out_temp | OUTSIDE TEMPERATURE | °F |
Room | ROOM LOCATION | |
In_temp | INSIDE TEMPERATURE | °F |
Cor_fac | CORRECTION FACTOR ADDED (1 = YES, 0 = NO) | |
Typ_wea | TYPE OF WEATHER (1 = SUNNY, 2 = PARTLY CLOUDY, 3 = CLOUDY, 4 = RAINY, 5 = FOGGY, 9 = MISSING) | |
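Two small transformations are often useful with this data set: decoding the weather codes and converting the temperatures from °F to °C. The sketch below shows one way to do both; the function names are hypothetical, and treating any undocumented code as missing is an assumption.

```python
# Labels for the documented Typ_wea codes; 9 (missing) is deliberately left out.
WEATHER_LABELS = {
    1: "sunny",
    2: "partly cloudy",
    3: "cloudy",
    4: "rainy",
    5: "foggy",
}


def decode_weather(code):
    """Map a Typ_wea code to its label; 9 and undocumented codes become None."""
    return WEATHER_LABELS.get(int(code))


def fahrenheit_to_celsius(temp_f):
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (float(temp_f) - 32.0) * 5.0 / 9.0
```

For example, `decode_weather(4)` gives `"rainy"`, and `fahrenheit_to_celsius(32)` gives `0.0`.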
TENNIS1
Dimensions: Rows: 444 Columns: 12
VARIABLE | VARIABLE NAME | unit |
---|---|---|
Id | ID | |
Age | AGE 99=MISSING | |
Sex | SEX | |
1 = MALE | ||
2 = FEMALE | ||
Num_epis | NUMBER OF EPISODES OF TENNIS ELBOW 9=MISSING | |
Typ_last | TYPE OF RACQUET USED DURING LAST EPISODE | |
1 = CONVENTIONAL SIZE | ||
2 = MID-SIZE | ||
3 = OVER-SIZE | ||
9 = MISSING | ||
Wgt_last | WEIGHT OF RACQUET USED DURING LAST EPISODE | |
1=HEAVY | ||
2=MEDIUM | ||
3=LIGHT | ||
4=DON’T KNOW | ||
9=MISSING | ||
Mat_last | MATERIAL OF RACQUET USED DURING LAST EPISODE | |
1=WOOD | ||
2=ALUMINUM | ||
3=FIBERGLASS AND COMPOSITE | ||
4=GRAPHITE | ||
5=STEEL | ||
6=COMPOSITE | ||
7=OTHER | ||
9=MISSING | ||
Str_last | STRING TYPE OF RACQUET USED DURING LAST EPISODE | |
1=NYLON | ||
2=GUT | ||
3=DON’T KNOW | ||
9=MISSING | ||
Typ_curr | TYPE OF RACQUET USED CURRENTLY | |
1 = CONVENTIONAL SIZE | ||
2 = MID-SIZE | ||
3 = OVER-SIZE | ||
9 = MISSING | ||
Wgt_curr | WEIGHT OF RACQUET USED CURRENTLY | |
1 = HEAVY | ||
2 = MEDIUM | ||
3 = LIGHT | ||
4 = DON’T KNOW | ||
9 = MISSING | ||
Mat_curr | MATERIAL OF RACQUET USED CURRENTLY | |
1 = WOOD | ||
2 = ALUMINUM | ||
3 = FIBERGLASS AND COMPOSITE | ||
4 = GRAPHITE | ||
5 = STEEL | ||
6 = COMPOSITE | ||
7 = OTHER | ||
9 = MISSING | ||
Str_curr | STRING TYPE OF RACQUET USED CURRENTLY | |
1 = NYLON | ||
2 = GUT | ||
3 = DON’T KNOW | ||
9 = MISSING |
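Note that the missing-data code differs between columns: 99 for `Age` and 9 for the other coded columns. A minimal sketch of a per-column recoding follows; the helper `recode_missing` and the example row are hypothetical, and the values in the row are made up.

```python
# Missing-data code for each column, taken from the table above.
# Id and Sex have no documented missing code and are therefore not listed.
MISSING_CODES = {
    "Age": 99,
    "Num_epis": 9,
    "Typ_last": 9,
    "Wgt_last": 9,
    "Mat_last": 9,
    "Str_last": 9,
    "Typ_curr": 9,
    "Wgt_curr": 9,
    "Mat_curr": 9,
    "Str_curr": 9,
}


def recode_missing(row):
    """Return a copy of the row with documented missing codes replaced by None."""
    return {
        name: None if MISSING_CODES.get(name) == value else value
        for name, value in row.items()
    }


# Hypothetical row with made-up values.
row = {"Id": 9, "Age": 99, "Sex": 1, "Num_epis": 2, "Typ_last": 9}
recoded = recode_missing(row)
```

Keeping the codes in a per-column table prevents a legitimate value such as `Id` 9 from being treated as missing.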
TENNIS2
Dimensions: Rows: 88 Columns: 16
VARIABLE | PERIOD* | VARIABLE NAME |
---|---|---|
id | ID | |
age | AGE | |
sex | SEX | |
1 = MALE | ||
2 = FEMALE | ||
9 = MISSING | ||
drg_ord | DRUG ORDER | |
1 = MOTRIN-PLACEBO | ||
2 = PLACEBO-MOTRIN | ||
painmx_2 | 2 | DURING STUDY PERIOD, PAIN DURING MAXIMUM ACTIVITY VS BASELINE |
1 = WORSE | ||
2 = UNCHANGED | ||
3 = SLIGHTLY IMPROVED (25%) | ||
4 = MODERATELY IMPROVED (50%) | ||
5 = MOSTLY IMPROVED (75%) | ||
6 = COMPLETELY IMPROVED | ||
9 = MISSING | ||
pain12_2 | 2 | WITHIN 12 HOURS FOLLOWING MAXIMAL ACTIVITY, COMPARED TO SAME PERIOD AT BASELINE (SAME CODE AS painmx_2) |
painav_2 | 2 | DURING THE AVERAGE DAY OF STUDY PERIOD, PAIN VS BASELINE (SAME CODE AS painmx_2) |
painov_2 | 2 | OVERALL IMPRESSION OF DRUG EFFICACY VS BASELINE (SAME CODE AS painmx_2) |
painmx_3 | 3 | DURING STUDY PERIOD, PAIN DURING MAXIMUM ACTIVITY VS BASELINE (SAME CODE AS painmx_2) |
pain12_3 | 3 | WITHIN 12 HOURS FOLLOWING MAXIMAL ACTIVITY, COMPARED TO SAME PERIOD AT BASELINE (SAME CODE AS painmx_2) |
painav_3 | 3 | DURING THE AVERAGE DAY OF STUDY PERIOD, PAIN VS BASELINE (SAME CODE AS painmx_2) |
painov_3 | 3 | OVERALL IMPRESSION OF DRUG EFFICACY VS BASELINE (SAME CODE AS painmx_2) |
painmx_4 | 4 | DURING STUDY PERIOD, PAIN DURING MAXIMUM ACTIVITY VS BASELINE (SAME CODE AS painmx_2) |
pain12_4 | 4 | WITHIN 12 HOURS FOLLOWING MAXIMAL ACTIVITY, COMPARED TO SAME PERIOD AT BASELINE (SAME CODE AS painmx_2) |
painav_4 | 4 | DURING THE AVERAGE DAY OF STUDY PERIOD, PAIN VS BASELINE (SAME CODE AS painmx_2) |
painov_4 | 4 | OVERALL IMPRESSION OF DRUG EFFICACY VS BASELINE (SAME CODE AS painmx_2) |
*) PERIOD:
- PERIOD 2 = PAIN SCORES AFTER THE FIRST ACTIVE DRUG PERIOD COMPARED WITH BASELINE
- PERIOD 3 = PAIN SCORES AFTER THE WASHOUT PERIOD COMPARED WITH BASELINE
- PERIOD 4 = PAIN SCORES AFTER THE SECOND ACTIVE DRUG PERIOD COMPARED WITH BASELINE
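Because the period is encoded in the suffix of each pain variable (`_2`, `_3`, `_4`), the data can be reshaped from one row per subject to one row per subject, measure and period. The following is a minimal sketch of that reshaping; the function `pain_scores_long`, the output keys and the example row (with made-up values) are hypothetical.

```python
# Code for a missing pain score, per the painmx_2 coding above.
MISSING = 9


def pain_scores_long(row):
    """Turn one wide TENNIS2 row into one record per (measure, period)."""
    records = []
    for name, value in row.items():
        if not name.startswith("pain"):
            continue  # id, age, sex and drg_ord are not pain scores
        # "painmx_2" -> measure "painmx", period "2"
        measure, period = name.rsplit("_", 1)
        score = int(value)
        records.append(
            {
                "id": row["id"],
                "measure": measure,
                "period": int(period),
                "score": None if score == MISSING else score,
            }
        )
    return records


# Hypothetical subject with made-up scores; only three pain columns are shown.
row = {"id": 1, "drg_ord": 1, "painmx_2": 4, "painmx_3": 9, "painmx_4": 2}
long_rows = pain_scores_long(row)
```

In the long format, the period can be combined with `drg_ord` to work out which drug a score belongs to.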
VALID
Dimensions: Rows: 173 Columns: 9
Variable | Description | unit |
---|---|---|
Id | ID number | |
sfat_dr | Saturated fat-DR | |
sfat_ffq | Saturated fat-FFQ | |
tfat_dr | Total fat-DR | |
tfat_ffq | Total fat-FFQ | |
alco_dr | Alcohol consumption-DR | |
alco_ffq | Alcohol consumption-FFQ | |
cal_dr | Total calories-DR | |
cal_ffq | Total calories-FFQ |
spermatozoa
Data from a 7-year longitudinal study started in spring 1975 in Edinburgh. Two classes at an elementary school participated, and informed consent was obtained from 40 of the 42 boys in the classes. Every 3 months, a 24-hour urine sample was collected from each boy. These samples were analyzed for the presence of spermatozoa. The variable observations records the result of each urine sample.
The purpose of the study was to determine the age at spermarche.
The format of observations is unusual and suitable for cleaning exercises.
For further details about the study, see Nielsen et al. (1986a, 1986b):
Nielsen, C. T., Skakkebæk, N. E., Darling, J. A. B., Hunter, W. M., Richardson, D. W., Jørgensen, M., and Keiding, N. (1986a). Longitudinal study of testosterone and luteinizing hormone (LH) in relation to spermarche, pubic hair, height and sitting height in normal boys. Acta Endocrinologica 113, Supplementum 279, 98-106.
Nielsen, C. T., Skakkebæk, N. E., Richardson, D. W., Darling, J. A. B., Hunter, W. M., Jørgensen, M., Nielsen, A., Ingerslev, O., Keiding, N., and Muller, J. (1986b). Onset of the release of spermatozoa (spermarche) in boys in relation to age, testicular growth, pubic hair and height. Journal of Clinical Endocrinology and Metabolism 62, 532–535.
Dimensions: Rows: 40 Columns: 5
Also Rosner…
Variable | Description | unit |
---|---|---|
boy | ID | |
first_positive | Age at first spermatozoa-positive urine sample | years |
entry | Age at start of study | years |
exit | Age at end of study (e.g. exit from study) | years |
observations | Spermatozoa-positive urine samples | |
+ = positive | ||
- = negative |
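As a starting point for a cleaning exercise, the results in observations can be tallied per boy. The sketch below assumes that each value is a text string in which every sample appears as a `+` or `-` character, possibly mixed with other separators; the exact layout of the column is not documented here, so the function `summarise_observations` and the example string are hypothetical.

```python
def summarise_observations(observations):
    """Count positive and negative samples in one observations string.

    Assumes each sample is recorded as a "+" or "-" character;
    any other characters (spaces, separators) are ignored.
    """
    positive = observations.count("+")
    negative = observations.count("-")
    return {
        "positive": positive,
        "negative": negative,
        "total": positive + negative,
    }


# Hypothetical value, not taken from the data set.
summary = summarise_observations("- - + - + +")
```

If the real column uses a different layout, only the counting step needs to change.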
who
Dimensions: Rows: 405440 Columns: 10
NB: The file is semicolon-separated.
Variable | Description | Unit |
---|---|---|
country | Country name | |
iso2 | ISO2 country code | |
iso3 | ISO3 country code | |
year | Year | XXXX |
new | Artefact from data processing | All fields are “new” |
diag | Diagnostic method | * |
sex | Sex: m = male, f = female | |
age_low | Lower bound of age interval | year |
age_high | Upper bound of age interval | year |
value | Number of observed cases of TB | |
*) Diagnostic method:
- sp: positive pulmonary smear
- ne: negative pulmonary smear
- ep: extrapulmonary
- relapse
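Since the file is semicolon-separated, the delimiter has to be given explicitly when reading it. The following is a minimal sketch using Python's `csv` module. The in-memory sample stands in for the real file, its single data row is made up, and the presence of a header line and of empty fields for missing counts are assumptions.

```python
import csv
import io

# Stand-in for the real file: header as in the table above, one made-up row.
sample = io.StringIO(
    "country;iso2;iso3;year;new;diag;sex;age_low;age_high;value\n"
    "Exampleland;EX;EXL;2000;new;sp;m;15;24;10\n"
)


def to_int_or_none(text):
    """Convert a field to int, treating an empty field as missing (assumption)."""
    return int(text) if text.strip() else None


rows = []
for record in csv.DictReader(sample, delimiter=";"):
    # Numeric columns arrive as text and are converted explicitly.
    for column in ("year", "age_low", "age_high", "value"):
        record[column] = to_int_or_none(record[column])
    rows.append(record)
```

With the real file, `sample` would be replaced by a file handle opened with `newline=""`, as the `csv` documentation recommends.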
Wine
A data set containing the results of a chemical analysis of wines from three different cultivars (varieties of grape) grown in the same region in Italy. Thirteen quantities were measured for each wine.
Usable for PCA and RDA. Note that this data set does not have column-names.
Dimensions: Rows: 178 Columns: 14
Variable | Description | Unit |
---|---|---|
1 | Cultivar | |
2 | Alcohol | % |
3 | Malic acid | g/L |
4 | Ash | g/L |
5 | Alcalinity of ash | meq/L (milliequivalents per liter) |
6 | Magnesium | mg/L |
7 | Total phenols | g/L |
8 | Flavanoids | g/L |
9 | Nonflavanoid phenols | g/L |
10 | Proanthocyanins | g/L |
11 | Colour intensity | Absorbance |
12 | Hue | Absorbance-ratio |
13 | OD280/OD315 of diluted wines | Absorbance-ratio |
14 | Proline | mg/L |
Colour intensity is measured as the sum of the absorbance at 420, 520 and 620 nm (blue, green and red light respectively), which measures the yellow, red and blue colours of the wine.
Hue is measured as absorbance at 420 nm divided by absorbance at 520 nm.
OD280/OD315 is measured as absorbance at 280 nm divided by absorbance at 315 nm.
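Because the file has no column names, they have to be supplied when the data are read. The sketch below attaches names based on the table above. The identifiers in `COLUMN_NAMES` are hypothetical, the file is assumed to be comma-separated, and the sample row contains made-up values rather than values from the data set.

```python
import csv
import io

# Hypothetical column names, one per numbered column in the table above.
COLUMN_NAMES = [
    "cultivar",
    "alcohol",
    "malic_acid",
    "ash",
    "alcalinity_of_ash",
    "magnesium",
    "total_phenols",
    "flavanoids",
    "nonflavanoid_phenols",
    "proanthocyanins",
    "colour_intensity",
    "hue",
    "od280_od315",
    "proline",
]

# Stand-in for the header-less file: one made-up row with 14 values.
sample = io.StringIO("1,13.0,2.0,2.4,20.0,100,2.5,2.0,0.3,1.5,5.0,1.0,3.0,800\n")

rows = []
for record in csv.reader(sample):
    row = dict(zip(COLUMN_NAMES, (float(field) for field in record)))
    # The cultivar is a class label, not a measurement, so keep it as an integer.
    row["cultivar"] = int(row["cultivar"])
    rows.append(row)
```

For PCA, the `cultivar` column would typically be set aside as a label and the remaining 13 columns used as the variables.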
References
1: Rosner, Bernard A. Fundamentals of Biostatistics, 7/e, International Edition, 2011 ISBN: 9780538735896. https://www.cengage.com/cgi-wadsworth/course_products_wp.pl?fid=M20b&product_isbn_issn=9780538733496&token
There is also useful material here: https://www.doc88.com/p-5925003681540.html
https://statanaly.com/wp-content/uploads/2023/04/Fundamentals-of-Biostatistics-7th-Edition.pdf
2: Hopper, J.H. & Seeman, E. (1994). The bone density of female twins discordant for tobacco use. New England Journal of Medicine, 330, 387-392.
3: Mandel, E., Bluestone, C.D., Rockette, H.E., Blatter, M.M., Reisinger, K.S., Wucher, E.P. & Harper, J. (1982). Duration of effusion after antibiotic treatment for acute otitis media: Comparison of cefaclor and amoxicillin. Pediatric Infectious Disease, 1, 310-316.
4: Jørgensen, M., Keiding, N. & Skakkebæk, N.E. (1991). Estimation of Spermarche from Longitudinal Spermaturia Data. Biometrics, 47(1), 177-193. https://doi.org/10.2307/2532505 https://www.jstor.org/stable/2532505
5: Buring, J.E., Evans, D.A., Mayrent, S.L., Rosner, B., Colton, T. & Hennekens, C.H. (1988). Randomized trials of aminoglycoside antibiotics. Reviews of Infectious Diseases, 10(5), 951-957.
6: Tager, I.B., Weiss, S.T., Rosner, B. & Speizer, F.E. (1979). Effect of parental cigarette smoking on pulmonary function in children. American Journal of Epidemiology, 110, 15-26.
7: https://www.who.int/teams/global-tuberculosis-programme/data
8: Townsend, T.R., Shapiro, M., Rosner, B. & Kass, E.H. (1979). Use of antimicrobial drugs in general hospitals. I. Description for population and definition of methods. Journal of Infectious Diseases, 139(6), 688-697.
9: Aeberhard, S. & Forina, M. 1991, UCI Machine Learning Repository, https://doi.org/10.24432/C5PC7J
10: Stoet, G. & Geary, D.C. 2019, A simplified approach to measuring national gender inequality, PLOS ONE, 14(1), 1-18, https://doi.org/10.1371/journal.pone.0205349
List of datasets not sufficiently documented yet
Data sets are ticked off as their documentation is completed; once all are done, issue 113 is done.
- infantbp
- lead
- mice
- nephro
- nifed
- oto
- piriform
- smoke
- tear
- temperat
- tennis1
- tennis2
- valid
- spermatozoa
- who