easypackages::libraries(
# Data i/o
"here", # relative file path
"rio", # file import-export
# Data manipulation
"janitor", # data cleaning fns
"haven", # stata, sas, spss data io
"labelled", # var labelling
"readxl", # excel sheets
# "scales", # to change formats and units
"skimr", # quick data summary
"broom", # view model results
# Analysis output
"gt",
# "modelsummary", # output summary tables
"gtsummary", # output summary tables
"flextable", # creating tables from objects
"officer", # editing in office docs
# R graph related packages
"ggstats",
"RColorBrewer",
# "scales",
"patchwork",
# Misc packages
"tidyverse", # Data manipulation iron man
"tictoc" # Code timing
)
South Asia DHS variable checklist
Getting started
This section shows the prerequisite code. Run it at the outset to avoid errors. First we load the required packages.
Next we turn off scientific notation.
options(scipen = 999)
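As a small illustration of what this option changes, the sketch below formats the same number under R's default penalty and under the setting used here. The object names `sci_default` and `sci_off` are hypothetical and exist only for this example.

```r
# Temporarily switch to R's default penalty and remember the previous setting.
old <- options(scipen = 0)
sci_default <- format(1e8)   # "1e+08": scientific form is shorter, so R prefers it

# With a large penalty, fixed notation is preferred for printing.
options(scipen = 999)
sci_off <- format(1e8)       # "100000000"

# Restore whatever setting was active before this sketch ran.
options(old)
```

A large `scipen` value only affects how numbers are printed; the stored values are unchanged.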
Next we set the default gtsummary print engine for tables.
theme_gtsummary_printer(print_engine = "flextable")
Now we set the flextable output defaults.
set_flextable_defaults(
font.size = 11,
text.align = "left",
big.mark = "",
background.color = "white",
table.layout = "autofit",
theme_fun = theme_vanilla
)
Introduction
This page presents the variable checklist for the South Asian Demographic and Health Survey (DHS) datasets. It is the primary reference for seeing which variables are available in each DHS dataset, both within a country and across the countries of South Asia. We check variable availability against the raw data dictionaries and only then run the data pooling code.
In every checklist, the table header row gives the dataset name, the second column gives the variable label, and the cells give the variable name. If a variable is available in a dataset we give its name; if it is unavailable we leave the cell empty.
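The layout described above can be sketched as a reshape from a long lookup table to a wide one. This is a minimal illustration, not the code used to build the checklists: the `lookup` data frame and its column names are hypothetical, and only a few entries are copied from the tables on this page.

```r
library(tidyr)

# Hypothetical long-format lookup: one row per (dataset, label) pair that exists.
lookup <- data.frame(
  dataset  = c("AFDHS 2015", "AFDHS 2015", "BDDHS 2014", "BDDHS 2014"),
  label    = c("Cluster number", "Ethnicity", "Cluster number", "Religion"),
  variable = c("v001", "v131", "v001", "v130")
)

# Labels become rows, datasets become columns, variable names fill the cells.
# Pairs that are absent from the lookup are left as NA (shown as empty cells).
checklist <- pivot_wider(lookup, names_from = dataset, values_from = variable)
checklist
```

Keeping the lookup in long format makes it easy to add a new survey round: append its rows and rerun the reshape.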
Afghanistan variable checklist
The variable availability for Afghanistan is given below.
pos | Variable labels | AFDHS 2015 |
---|---|---|
1 | ID and Weight variables | |
2 | Women's ind sample weight | v005 |
3 | Primary sampling unit | v021 |
4 | Sample stratum number | v022 |
5 | Strata used in sample design | v023 |
6 | Cluster number | v001 |
7 | Household number | v002 |
8 | Respondent's line number | v003 |
9 | Birth order | bord |
10 | Subnational region variables | |
11 | Region/Division/State | v024 |
12 | District of residence | sdistrict |
13 | Ecological region | |
14 | Birth history variables | |
15 | Survival status of child | b5 |
16 | Child's age at death (cmc) | b7 |
17 | Child is twin | b0 |
18 | Child's month of birth | b1 |
19 | Child's year of birth | b2 |
20 | Child's date of birth (cmc) | b3 |
21 | Child's sex | b4 |
22 | Preceding birth interval | b11 |
23 | Mother-level variables | |
24 | Mother's date of birth (cmc) | v011 |
25 | Highest education level | v106 |
26 | Current marital status | v501 |
27 | Partner's education level | v701 |
28 | Mother's anemia level | |
29 | Respondent's weight in kg | |
30 | Respondent's height in cm | |
31 | Household-level variables | |
32 | Sex of household head | v151 |
33 | Age of household head | v152 |
34 | Wealth index quintile | v190 |
35 | Social group variables | |
36 | Religion | |
37 | Ethnicity | v131 |
38 | Caste | |
39 | Language of questionnaire | |
40 | Language of interview | v045b |
41 | Native language of respondent | v045c |
42 | Family-structure variables | |
43 | Relationship to hh head | hv101 |
44 | De jure residents | hv102 |
45 | De facto residents | hv103 |
46 | Sex of hh member | hv104 |
47 | Age of hh member | hv105 |
48 | Highest education of hh member | hv106 |
49 | Community-level variables | |
50 | Distance to healthcare facility | v467 |
51 | Covered by health insurance | v481 |
52 | Cluster altitude |
Bangladesh variable checklist
The variable availability for Bangladesh is given below.
pos | Variable labels | BDDHS 1993 | BDDHS 1996 | BDDHS 1999 | BDDHS 2004 | BDDHS 2007 | BDDHS 2011 | BDDHS 2014 | BDDHS 2017 | BDDHS 2022 |
---|---|---|---|---|---|---|---|---|---|---|
1 | ID and Weight variables | |||||||||
2 | Women's ind sample weight | v005 | v005 | v005 | v005 | v005 | v005 | v005 | v005 | v005 |
3 | Primary sampling unit | v021 | v021 | v021 | v021 | v021 | v021 | v021 | v021 | v021 |
4 | Sample stratum number | v022 | v022 | v022 | v022 | v022 | v022 | v022 | v022 | v022 |
5 | Strata used in sample design | v023 | v023 | v023 | v023 | v023 | v023 | v023 | v023 | v023 |
6 | Cluster number | v001 | v001 | v001 | v001 | v001 | v001 | v001 | v001 | v001 |
7 | Household number | v002 | v002 | v002 | v002 | v002 | v002 | v002 | v002 | v002 |
8 | Respondent's line number | v003 | v003 | v003 | v003 | v003 | v003 | v003 | v003 | v003 |
9 | Birth order | bord | bord | bord | bord | bord | bord | bord | bord | bord |
10 | Subnational region variables | |||||||||
11 | Region/Division/State | v024 | v024 | v024 | v024 | v024 | v024 | v024 | v024 | v024 |
12 | District of residence | sdistr | sdist | sdist | ||||||
13 | Ecological region | |||||||||
14 | Birth history variables | |||||||||
15 | Survival status of child | b5 | b5 | b5 | b5 | b5 | b5 | b5 | b5 | b5 |
16 | Child's age at death (cmc) | b7 | b7 | b7 | b7 | b7 | b7 | b7 | b7 | b7 |
17 | Child is twin | b0 | b0 | b0 | b0 | b0 | b0 | b0 | b0 | b0 |
18 | Child's month of birth | b1 | b1 | b1 | b1 | b1 | b1 | b1 | b1 | b1 |
19 | Child's year of birth | b2 | b2 | b2 | b2 | b2 | b2 | b2 | b2 | b2 |
20 | Child's date of birth (cmc) | b3 | b3 | b3 | b3 | b3 | b3 | b3 | b3 | b3 |
21 | Child's sex | b4 | b4 | b4 | b4 | b4 | b4 | b4 | b4 | b4 |
22 | Preceding birth interval | b11 | b11 | b11 | b11 | b11 | b11 | b11 | b11 | b11 |
23 | Mother-level variables | |||||||||
24 | Mother's date of birth (cmc) | v011 | v011 | v011 | v011 | v011 | v011 | v011 | v011 | v011 |
25 | Highest education level | v106 | v106 | v106 | v106 | v106 | v106 | v106 | v106 | v106 |
26 | Current marital status | v501 | v501 | v501 | v501 | v501 | v501 | v501 | v501 | v501 |
27 | Partner's education level | v701 | v701 | v701 | v701 | v701 | v701 | v701 | v701 | v701 |
28 | Mother's anemia level | v457 | ||||||||
29 | Respondent's weight in kg | v437 | v437 | v437 | v437 | v437 | v437 | v437 | ||
30 | Respondent's height in cm | v438 | v438 | v438 | v438 | v438 | v438 | v438 | ||
31 | Household-level variables | |||||||||
32 | Sex of household head | v151 | v151 | v151 | v151 | v151 | v151 | v151 | v151 | v151 |
33 | Age of household head | v152 | v152 | v152 | v152 | v152 | v152 | v152 | v152 | v152 |
34 | Wealth index quintile | v190 | v190 | v190 | v190 | v190 | v190 | |||
35 | Social group variables | |||||||||
36 | Religion | v130 | v130 | v130 | v130 | v130 | v130 | v130 | v130 | v130 |
37 | Ethnicity | |||||||||
38 | Caste | |||||||||
39 | Language of questionnaire | v045a | ||||||||
40 | Language of interview | v045b | ||||||||
41 | Native language of respondent | v045c | ||||||||
42 | Family-structure variables | |||||||||
43 | Relationship to hh head | hv101 | hv101 | hv101 | hv101 | hv101 | hv101 | hv101 | hv101 | hv101 |
44 | De jure residents | hv102 | hv102 | hv102 | hv102 | hv102 | hv102 | hv102 | hv102 | hv102 |
45 | De facto residents | hv103 | hv103 | hv103 | hv103 | hv103 | hv103 | hv103 | hv103 | hv103 |
46 | Sex of hh member | hv104 | hv104 | hv104 | hv104 | hv104 | hv104 | hv104 | hv104 | hv104 |
47 | Age of hh member | hv105 | hv105 | hv105 | hv105 | hv105 | hv105 | hv105 | hv105 | hv105 |
48 | Highest education of hh member | hv106 | hv106 | hv106 | hv106 | hv106 | hv106 | hv106 | hv106 | hv106 |
49 | Community-level variables | |||||||||
50 | Distance to healthcare facility | |||||||||
51 | Covered by health insurance | v481 | v481 | |||||||
52 | Cluster altitude | v040 | v040 | v040 |
NOTE:
- In Bangladesh DHS 1993, 1996 and 1999 the wealth index variable is available in the HR files.
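When the wealth index sits in the HR file, it has to be attached to the women's records before pooling. The sketch below shows one way to do that with a household-level join on cluster and household number. The toy data frames `ir` and `hr` are hypothetical, and the HR variable names `hv001`, `hv002` and `hv270` are assumed from the standard DHS recode; check them against the data dictionary of each round.

```r
library(dplyr)

# Toy stand-ins for a women's (IR) file and a household (HR) file.
ir <- data.frame(v001 = c(1, 1, 2), v002 = c(10, 11, 10), v003 = c(2, 1, 3))
hr <- data.frame(hv001 = c(1, 1, 2), hv002 = c(10, 11, 10), hv270 = c(1, 3, 5))

# Keep only the household keys and the wealth index, then attach them to each
# woman by cluster number (v001 = hv001) and household number (v002 = hv002).
ir_wealth <- left_join(
  ir,
  select(hr, hv001, hv002, hv270),
  by = c("v001" = "hv001", "v002" = "hv002")
)
ir_wealth
```

A left join keeps every woman's record even if her household is missing from the HR extract, so unmatched rows show up as missing wealth values rather than disappearing.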
India variable checklist
The variable availability for India is given below.
pos | Variable labels | IADHS 1992 | IADHS 1998 | IADHS 2005 | IADHS 2015 | IADHS 2019 |
---|---|---|---|---|---|---|
1 | ID and Weight variables | |||||
2 | Women's ind sample weight | v005 | v005 | v005 | v005 | v005 |
3 | Primary sampling unit | v021 | v021 | v021 | v021 | v021 |
4 | Sample stratum number | v022 | v022 | v022 | v022 | v022 |
5 | Strata used in sample design | v023 | v023 | v023 | v023 | v023 |
6 | Cluster number | v001 | v001 | v001 | v001 | v001 |
7 | Household number | v002 | v002 | v002 | v002 | v002 |
8 | Respondent's line number | v003 | v003 | v003 | v003 | v003 |
9 | Birth order | bord | bord | bord | bord | bord |
10 | Subnational region variables | |||||
11 | Region/Division/State | v024 | v024 | v024 | v024 | v024 |
12 | District of residence | shdist | sdist | shdistri | shdist | |
13 | Ecological region | |||||
14 | Birth history variables | |||||
15 | Survival status of child | b5 | b5 | b5 | b5 | b5 |
16 | Child's age at death (cmc) | b7 | b7 | b7 | b7 | b7 |
17 | Child is twin | b0 | b0 | b0 | b0 | b0 |
18 | Child's month of birth | b1 | b1 | b1 | b1 | b1 |
19 | Child's year of birth | b2 | b2 | b2 | b2 | b2 |
20 | Child's date of birth (cmc) | b3 | b3 | b3 | b3 | b3 |
21 | Child's sex | b4 | b4 | b4 | b4 | b4 |
22 | Preceding birth interval | b11 | b11 | b11 | b11 | b11 |
23 | Mother-level variables | |||||
24 | Mother's date of birth (cmc) | v011 | v011 | v011 | v011 | v011 |
25 | Highest education level | v106 | v106 | v106 | v106 | v106 |
26 | Current marital status | v501 | v501 | v501 | v501 | v501 |
27 | Partner's education level | v701 | v701 | v701 | v701 | v701 |
28 | Mother's anemia level | |||||
29 | Respondent's weight in kg | |||||
30 | Respondent's height in cm | |||||
31 | Household-level variables | |||||
32 | Sex of household head | v151 | v151 | v151 | v151 | v151 |
33 | Age of household head | v152 | v152 | v152 | v152 | v152 |
34 | Wealth index quintile | v190 | v190 | v190 | ||
35 | Social group variables | |||||
36 | Religion | v130 | v130 | v130 | v130 | v130 |
37 | Ethnicity | v131 | v131 | v131 | v131 | |
38 | Caste | shcaste | sh40/sh41 | sh45/sh46 | sh35/sh36 | sh48/sh49 |
39 | Language of questionnaire | slangqst | slangq | v045a | ||
40 | Language of interview | slangint | slangi | v045b | ||
41 | Native language of respondent | slangrsp | slanguag | slang | slangrm | v045c |
42 | Family-structure variables | |||||
43 | Relationship to hh head | hv101 | hv101 | hv101 | hv101 | hv101 |
44 | De jure residents | hv102 | hv102 | hv102 | hv102 | hv102 |
45 | De facto residents | hv103 | hv103 | hv103 | hv103 | hv103 |
46 | Sex of hh member | hv104 | hv104 | hv104 | hv104 | hv104 |
47 | Age of hh member | hv105 | hv105 | hv105 | hv105 | hv105 |
48 | Highest education of hh member | hv106 | hv106 | hv106 | hv106 | hv106 |
49 | Community-level variables | |||||
50 | Distance to healthcare facility | |||||
51 | Covered by health insurance | v481 | v481 | |||
52 | Cluster altitude | saltitud | v040 | v040 | v040 |
NOTE: Regarding the variable names and their corresponding datasets
- In India DHS 1992 the caste variable is available only in the HR and PR files.
- Across all India DHS rounds the ethnicity variable is similar to caste.
- The language-related variables are in the IR and BR files.
- In India DHS 1992 and 1998 the wealth index variable is available in the HR files.
Maldives variable checklist
The variable availability for Maldives is given below.
pos | Variable labels | MVDHS 2009 | MVDHS 2016 |
---|---|---|---|
1 | ID and Weight variables | ||
2 | Women's ind sample weight | v005 | v005 |
3 | Primary sampling unit | v021 | v021 |
4 | Sample stratum number | v022 | v022 |
5 | Strata used in sample design | v023 | v023 |
6 | Cluster number | v001 | v001 |
7 | Household number | v002 | v002 |
8 | Respondent's line number | v003 | v003 |
9 | Birth order | bord | bord |
10 | Subnational region variables | ||
11 | Region/Division/State | v024 | v024 |
12 | District of residence | ||
13 | Ecological region | ||
14 | Birth history variables | ||
15 | Survival status of child | b5 | b5 |
16 | Child's age at death (cmc) | b7 | b7 |
17 | Child is twin | b0 | b0 |
18 | Child's month of birth | b1 | b1 |
19 | Child's year of birth | b2 | b2 |
20 | Child's date of birth (cmc) | b3 | b3 |
21 | Child's sex | b4 | b4 |
22 | Preceding birth interval | b11 | b11 |
23 | Mother-level variables | ||
24 | Mother's date of birth (cmc) | v011 | v011 |
25 | Highest education level | v106 | v106 |
26 | Current marital status | v501 | v501 |
27 | Partner's education level | v701 | v701 |
28 | Mother's anemia level | ||
29 | Respondent's weight in kg | ||
30 | Respondent's height in cm | ||
31 | Household-level variables | ||
32 | Sex of household head | v151 | v151 |
33 | Age of household head | v152 | v152 |
34 | Wealth index quintile | v190 | v190 |
35 | Social group variables | ||
36 | Religion | ||
37 | Caste | ||
38 | Ethnicity | ||
39 | Language of questionnaire | v045a | |
40 | Language of interview | v045b | |
41 | Native language of respondent | v045c | |
42 | Family-structure variables | ||
43 | Relationship to hh head | hv101 | hv101 |
44 | De jure residents | hv102 | hv102 |
45 | De facto residents | hv103 | hv103 |
46 | Sex of hh member | hv104 | hv104 |
47 | Age of hh member | hv105 | hv105 |
48 | Highest education of hh member | hv106 | hv106 |
49 | Community-level variables | ||
50 | Distance to healthcare facility | ||
51 | Covered by health insurance | v481 | |
52 | Cluster altitude |
Nepal variable checklist
The variable availability for Nepal is given below.
pos | Variable labels | NPDHS 1996 | NPDHS 2001 | NPDHS 2006 | NPDHS 2011 | NPDHS 2016 | NPDHS 2022 |
---|---|---|---|---|---|---|---|
1 | ID and Weight variables | ||||||
2 | Women's ind sample weight | v005 | v005 | v005 | v005 | v005 | v005 |
3 | Primary sampling unit | v021 | v021 | v021 | v021 | v021 | v021 |
4 | Sample stratum number | v022 | v022 | v022 | v022 | v022 | v022 |
5 | Strata used in sample design | v023 | v023 | v023 | v023 | v023 | v023 |
6 | Cluster number | v001 | v001 | v001 | v001 | v001 | v001 |
7 | Household number | v002 | v002 | v002 | v002 | v002 | v002 |
8 | Respondent's line number | v003 | v003 | v003 | v003 | v003 | v003 |
9 | Birth order | bord | bord | bord | bord | bord | bord |
10 | Subnational region variables | ||||||
11 | Region/Division/State | v024 | v024 | v024 | v024 | v024 | v024 |
12 | District of residence | shdistr | shdist | shdistrict | shdist | shdist | |
13 | Ecological region | shez | shreg1 | shreg1 | shecoreg | shecoreg | shecoreg |
14 | Birth history variables | ||||||
15 | Survival status of child | b5 | b5 | b5 | b5 | b5 | b5 |
16 | Child's age at death (cmc) | b7 | b7 | b7 | b7 | b7 | b7 |
17 | Child is twin | b0 | b0 | b0 | b0 | b0 | b0 |
18 | Child's month of birth | b1 | b1 | b1 | b1 | b1 | b1 |
19 | Child's year of birth | b2 | b2 | b2 | b2 | b2 | b2 |
20 | Child's date of birth (cmc) | b3 | b3 | b3 | b3 | b3 | b3 |
21 | Child's sex | b4 | b4 | b4 | b4 | b4 | b4 |
22 | Preceding birth interval | b11 | b11 | b11 | b11 | b11 | b11 |
23 | Mother-level variables | ||||||
24 | Mother's date of birth (cmc) | v011 | v011 | v011 | v011 | v011 | v011 |
25 | Highest education level | v106 | v106 | v106 | v106 | v106 | v106 |
26 | Current marital status | v501 | v501 | v501 | v501 | v501 | v501 |
27 | Partner's education level | v701 | v701 | v701 | v701 | v701 | v701 |
28 | Mother's anemia level | ||||||
29 | Respondent's weight in kg | ||||||
30 | Respondent's height in cm | ||||||
31 | Household-level variables | ||||||
32 | Sex of household head | v151 | v151 | v151 | v151 | v151 | v151 |
33 | Age of household head | v152 | v152 | v152 | v152 | v152 | v152 |
34 | Wealth index quintile | v190 | v190 | v190 | v190 | v190 | v190 |
35 | Social group variables | ||||||
36 | Religion | v130 | v130 | v130 | v130 | v130 | v130 |
37 | Caste | sh020 | sh27 | scaste | |||
38 | Ethnicity | v131 | v131 | v131 | v131 | v131 | v131 |
39 | Language of questionnaire | slangq | slangq | sqlang | slquest | v045a | v045a |
40 | Language of interview | slangi | slangi | silang | slinterv | v045b | v045b |
41 | Native language of respondent | slangn | slangr | snlang | slnative | v045c | v045c |
42 | Family-structure variables | ||||||
43 | Relationship to hh head | hv101 | hv101 | hv101 | hv101 | hv101 | hv101 |
44 | De jure residents | hv102 | hv102 | hv102 | hv102 | hv102 | hv102 |
45 | De facto residents | hv103 | hv103 | hv103 | hv103 | hv103 | hv103 |
46 | Sex of hh member | hv104 | hv104 | hv104 | hv104 | hv104 | hv104 |
47 | Age of hh member | hv105 | hv105 | hv105 | hv105 | hv105 | hv105 |
48 | Highest education of hh member | hv106 | hv106 | hv106 | hv106 | hv106 | hv106 |
49 | Community-level variables | ||||||
50 | Distance to healthcare facility | ||||||
51 | Covered by health insurance | v481 | |||||
52 | Cluster altitude | v040 | v040 | v040 | v040 |
NOTE: Regarding the variable names and their corresponding datasets
- In all Nepal DHS rounds the district variable is available in the IR, HR and PR files. Here we quote the names from the HR and PR files.
- In all Nepal DHS rounds the ecological region variable is available in the IR, HR and PR files. Here we quote the names from the HR and PR files.
- In all Nepal DHS rounds the ethnicity variable is similar to caste.
- The caste variable is available in the HR and PR files of the 1996 and 2001 rounds and in the BR and IR files of the 2006 round.
- The language-related variables are in the IR and BR files.
- In Nepal DHS 1996 and 2001 the wealth index variable is available in a separate file.
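Because the same concept carries different variable names across rounds, pooling needs a renaming step first. The sketch below illustrates this for the language of interview, using the names listed in the Nepal table for three rounds. The toy data, the `rounds` list and the common name `lang_interview` are hypothetical, and this is not the pooling code itself.

```r
# Source variable name per round, copied from the "Language of interview" row.
lang_interview_vars <- c(
  "NPDHS 2006" = "silang",
  "NPDHS 2011" = "slinterv",
  "NPDHS 2016" = "v045b"
)

# Toy stand-ins for the three rounds (real files have many more columns).
rounds <- list(
  "NPDHS 2006" = data.frame(v001 = 1:2, silang = c(1, 2)),
  "NPDHS 2011" = data.frame(v001 = 1:2, slinterv = c(2, 2)),
  "NPDHS 2016" = data.frame(v001 = 1:2, v045b = c(1, 3))
)

# Rename the round-specific variable to one common name and tag the dataset.
harmonised <- lapply(names(rounds), function(nm) {
  df <- rounds[[nm]]
  names(df)[names(df) == lang_interview_vars[[nm]]] <- "lang_interview"
  df$dataset <- nm
  df
})

# Stack the rounds once their column names agree.
pooled <- do.call(rbind, harmonised)
pooled
```

Renaming only aligns the column names. The value codes may still differ between rounds, so they should be compared in the data dictionaries before the pooled variable is analysed.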
Pakistan variable checklist
The variable availability for Pakistan is given below.
pos | Variable labels | PKDHS 1990 | PKDHS 2006 | PKDHS 2012 | PKDHS 2017 |
---|---|---|---|---|---|
1 | ID and Weight variables | ||||
2 | Women's ind sample weight | v005 | v005 | v005 | v005 |
3 | Primary sampling unit | v021 | v021 | v021 | v021 |
4 | Sample stratum number | v022 | v022 | v022 | v022 |
5 | Strata used in sample design | v023 | v023 | v023 | v023 |
6 | Cluster number | v001 | v001 | v001 | v001 |
7 | Household number | v002 | v002 | v002 | v002 |
8 | Respondent's line number | v003 | v003 | v003 | v003 |
9 | Birth order | bord | bord | bord | bord |
10 | Subnational region variables | ||||
11 | Region/Division/State | v024 | v024 | v024 | v024 |
12 | District of residence | sdist | sdist | sdist | sdist |
13 | Ecological region | ||||
14 | Birth history variables | ||||
15 | Survival status of child | b5 | b5 | b5 | b5 |
16 | Child's age at death (cmc) | b7 | b7 | b7 | b7 |
17 | Child is twin | b0 | b0 | b0 | b0 |
18 | Child's month of birth | b1 | b1 | b1 | b1 |
19 | Child's year of birth | b2 | b2 | b2 | b2 |
20 | Child's date of birth (cmc) | b3 | b3 | b3 | b3 |
21 | Child's sex | b4 | b4 | b4 | b4 |
22 | Preceding birth interval | b11 | b11 | b11 | b11 |
23 | Mother-level variables | ||||
24 | Mother's date of birth (cmc) | v011 | v011 | v011 | v011 |
25 | Highest education level | v106 | v106 | v106 | v106 |
26 | Current marital status | v501 | v501 | v501 | v501 |
27 | Partner's education level | v701 | v701 | v701 | v701 |
28 | Mother's anemia level | ||||
29 | Respondent's weight in kg | ||||
30 | Respondent's height in cm | ||||
31 | Household-level variables | ||||
32 | Sex of household head | v151 | v151 | v151 | v151 |
33 | Age of household head | v152 | v152 | v152 | v152 |
34 | Wealth index quintile | v190 | v190 | v190 | v190 |
35 | Social group variables | ||||
36 | Religion | ||||
37 | Caste | ||||
38 | Ethnicity | v131 | |||
39 | Language of questionnaire | slang1 | slangq | slangq | v045a |
40 | Language of interview | slang2 | slangi | slangi | v045b |
41 | Native language of respondent | slang3 | slangw | slangr | v045c |
42 | Family-structure variables | ||||
43 | Relationship to hh head | hv101 | hv101 | hv101 | hv101 |
44 | De jure residents | hv102 | hv102 | hv102 | hv102 |
45 | De facto residents | hv103 | hv103 | hv103 | hv103 |
46 | Sex of hh member | hv104 | hv104 | hv104 | hv104 |
47 | Age of hh member | hv105 | hv105 | hv105 | hv105 |
48 | Highest education of hh member | hv106 | hv106 | hv106 | hv106 |
49 | Community-level variables | ||||
50 | Distance to healthcare facility | ||||
51 | Covered by health insurance | v481 | |||
52 | Cluster altitude |
START FROM HERE
The majority of variables have been compared and checked. The tobacco consumption and pregnancy calendar variables still need to be checked.