Learning Outcomes
1. To be aware of the main institutions operating at national and international level and of their data sources (e.g. Eurostat, ECB, IMF, ILO, BIS, UN, OECD, World Bank).
2. To understand and be able to use different kinds of data sources, such as censuses, sample surveys (cross-sectional and longitudinal), administrative sources, and big data.
3. To understand methodological issues in specific fields of official statistics and to interpret official statistics correctly.
4. To be able to apply methods suitable for producing and analysing data in the specific field.
5. To know and be able to apply special statistical methods such as sampling methods, non-response adjustments, and imputation.
6. To be able to use statistical software such as R or SPSS.
7. To be able to present data effectively to different kinds of audience.
8. To be aware of the different tools available for disseminating data and metadata and for presenting results (tables and charts in static and dynamic web-based environments).
Course Content (Syllabus)
Basic concepts in National Accounts. Basic concepts in Demography, definitions, research questions and framework of demography and population research. Introduction to techniques of demographic analysis; elements of period and cohort analysis; demographic rates, ratios and measures. Sources of data (census, vital statistics, sample surveys) and errors in demographic data. Population change and structure; analysis of migration data. Population pyramids, factors affecting age distribution of the population; socioeconomic and demographic implications of population aging. Population predictions and projections. Basic concepts, modern theory and practice of index numbers as a means of comparing prices and quantities.
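As a small illustration of the index-number topic, the sketch below computes Laspeyres, Paasche, and Fisher price indices for a two-item basket. The prices, quantities, and function names are invented for illustration and are not taken from the course material.

```python
from math import sqrt

# Hypothetical basket: base-period (0) and current-period (1) prices and quantities.
p0 = {"bread": 2.0, "milk": 1.0}
p1 = {"bread": 2.5, "milk": 1.2}
q0 = {"bread": 10, "milk": 20}
q1 = {"bread": 8, "milk": 22}


def laspeyres(p0, p1, q0):
    """Price index that values both periods' prices at base-period quantities."""
    return 100 * sum(p1[k] * q0[k] for k in p0) / sum(p0[k] * q0[k] for k in p0)


def paasche(p0, p1, q1):
    """Price index that values both periods' prices at current-period quantities."""
    return 100 * sum(p1[k] * q1[k] for k in p0) / sum(p0[k] * q1[k] for k in p0)


def fisher(p0, p1, q0, q1):
    """Geometric mean of the Laspeyres and Paasche indices."""
    return sqrt(laspeyres(p0, p1, q0) * paasche(p0, p1, q1))


if __name__ == "__main__":
    print(round(laspeyres(p0, p1, q0), 2))
    print(round(paasche(p0, p1, q1), 2))
    print(round(fisher(p0, p1, q0, q1), 2))
```

Because the Fisher index is a geometric mean of the other two, it always lies between them, which makes it a convenient check when comparing results.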
Downloading data from European Demographic Databases (Big Data). Tidying data, handling nonresponse and missing values using weights and multiple imputation methods, calculation of estimators, statistical analysis and hypothesis testing in sample survey data. The “Complex Samples Procedures” module of IBM SPSS Statistics and the R packages tidyr, stringr, lubridate, Hmisc, mice, survey, nlme, lme4, dplyr, shiny, surveyoutliers, tabplot, eurostat, RcmdrPlugin.sampling, cbsodataR, and censusapi, used in the RStudio environment. Presentation and visualization of data and statistical results with the ggplot2 package.
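To make the idea of handling nonresponse with weights concrete, the following is a minimal sketch of a weighting-class adjustment followed by a weighted mean. It is written in plain Python rather than with the R packages listed above, the sample records are made up, and the function names are hypothetical; it shows one common adjustment approach, not necessarily the one used in the course.

```python
from collections import defaultdict

# Hypothetical sample: adjustment class, design weight, response flag, survey value.
sample = [
    {"cls": "urban", "w": 10.0, "responded": True, "y": 4.0},
    {"cls": "urban", "w": 10.0, "responded": False, "y": None},
    {"cls": "rural", "w": 20.0, "responded": True, "y": 6.0},
    {"cls": "rural", "w": 20.0, "responded": True, "y": 8.0},
]


def nonresponse_adjusted(units):
    """Return respondents with weights inflated within each class.

    Each respondent's design weight is multiplied by
    (sum of design weights in the class) / (sum of respondent weights in the class),
    so respondents also represent the nonrespondents of their class.
    Assumes every class has at least one respondent.
    """
    total = defaultdict(float)
    responding = defaultdict(float)
    for u in units:
        total[u["cls"]] += u["w"]
        if u["responded"]:
            responding[u["cls"]] += u["w"]
    return [
        dict(u, w_adj=u["w"] * total[u["cls"]] / responding[u["cls"]])
        for u in units
        if u["responded"]
    ]


def weighted_mean(respondents):
    """Weighted mean of y using the adjusted weights."""
    numerator = sum(u["w_adj"] * u["y"] for u in respondents)
    denominator = sum(u["w_adj"] for u in respondents)
    return numerator / denominator


if __name__ == "__main__":
    adjusted = nonresponse_adjusted(sample)
    print([u["w_adj"] for u in adjusted])
    print(weighted_mean(adjusted))
```

After the adjustment, the respondents' weights sum to the same total as the original design weights, so estimated population totals are not deflated by the missing units.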
Keywords
Big Data, Tidying data, nonresponse, missing values, multiple imputation, survey data, eurostat, R, RStudio, IBM SPSS Statistics
Additional bibliography for study
1. Särndal, C.-E., B. Swensson, and J. Wretman. 1992. Model Assisted Survey Sampling. Springer-Verlag, New York.
2. Lumley, T. 2010. Complex Surveys: A Guide to Analysis Using R. John Wiley & Sons, Inc., Hoboken, New Jersey.
3. IBM SPSS Complex Samples 21. IBM Corporation, 1989-2012.
4. Rubin, D. B. 1987. Multiple Imputation for Nonresponse in Surveys. John Wiley & Sons, Inc., Hoboken, New Jersey.
5. van Buuren, S. 2012. Flexible Imputation of Missing Data. CRC Press, Taylor & Francis Group, Boca Raton.
6. Särndal, C.-E., and S. Lundström. 2005. Estimation in Surveys with Nonresponse. John Wiley & Sons Ltd, West Sussex, UK.
7. Bethlehem, J., F. Cobben, and B. Schouten. 2011. Handbook of Nonresponse in Household Surveys. John Wiley & Sons, Inc., Hoboken, New Jersey.