Copy files from Overleaf to Github Repo
ppdewolf authored Aug 26, 2024
1 parent eeee40d commit 0cb0618
Showing 13 changed files with 1,003 additions and 0 deletions.
Binary file added by-sa.png
28 changes: 28 additions & 0 deletions chapters/0-Introduction.tex
@@ -0,0 +1,28 @@
\chapter{Introduction}\label{ch:intro}
\chaptermark{Introduction} % Short version of chapter name for the page headers

The UN recommends that every country in the world conduct a Population and Housing Census at least once every ten years. In the European Union all member states have to conduct such a census in years ending in `1'. Historically, censuses were the only source of information about the number of people residing in a country. A number of countries used these censuses to set up population registers. Over time both the censuses and the population registers have come to contain more and more variables.


Many countries still conduct so-called traditional censuses, in which all information is collected via field work. Increasingly, however, population registers are used as a backbone for population statistics: the people who are in the population register should be enumerated. Countries with that approach have over time moved from a de facto census to a de jure census. These countries typically also base their demographic statistics on their population registers, so that consistency between the different population statistics can be achieved. In some countries traditional censuses with field work to enumerate the population no longer exist, as they have been replaced by so-called register-based censuses. These censuses are based on information from population registers combined with information from other administrative sources to which the statistical office also has access. Other countries conduct combined censuses, in which some variables are taken from registers and other variables are collected via field work. We can thus conclude that in all countries that conduct a register-based or combined census, population registers play an important role in the population and housing census. These population registers are then often also used for demographic statistics.


Over time both census and demographic statistics have become more detailed. Among other things, this leads to many small cell values and possible disclosures in tabular outputs. Therefore, both kinds of statistics have to be protected against disclosure of individual information. Traditionally, conventional rounding and suppression of cell values have been used to protect the information. However, if the aim is to publish detailed information, rounding leads to information loss, and the traditional technique of suppressing unsafe cells leads to large numbers of suppressed cells in tables.


In the European Census 2011 many different techniques were applied by the European Union member states. Although this helped countries to stay within their national legal frameworks, it severely hampered the comparability of tables between countries. This led to discussions on how to improve the situation for users of these tables. Solutions should take into account not only the direct risks of disclosure, but also the risk of disclosure by differencing. Special attention is needed for tables on grid squares (of 1 km x 1 km) according to national and European grid definitions. After joint projects of several European countries under a so-called Framework Partnership Agreement (FPA), the European Statistical System (ESS) recommended the use of Targeted Record Swapping (TRS) and the Cell Key Method (CKM) for Statistical Disclosure Control (SDC) of tabular output of the European Census 2021. Most countries applied at least one of these methods (see \cite{censusmethodsUNECE}).


It is clear that using the same SDC methods will lead to more comparable data. Moreover, with the recommended perturbative methods many more cells will be published than in the previous Census Round. With TRS some large differences may appear, e.g. when large households of unequal size are swapped. However, the percentage of households to be swapped is normally low, so that this effect is moderate. With CKM there may be small absolute differences between true values and published values in table cells. However, the relative differences between true values and published values are quite small for true values that are not too small.
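
As a rough illustration of the last point, assume (purely for illustration; this is not a parameter fixed in these guidelines) that the noise CKM adds to a cell is bounded in absolute value by a maximum deviation $D$. Writing $n$ for the true count and $\hat{n}$ for the published count, this gives
\[
  |\hat{n} - n| \le D
  \qquad\text{and hence}\qquad
  \frac{|\hat{n} - n|}{n} \le \frac{D}{n} \quad (n > 0).
\]
With the hypothetical value $D = 3$, a true count of $n = 1000$ would be published with a relative difference of at most $0.3\,\%$, whereas for $n = 5$ the same bound is $60\,\%$.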

Before TRS and/or CKM were used to protect census tables, the outcomes of the less detailed demographic statistics and the more detailed census statistics were consistent for those countries that used the same sources and the same reference dates for both types of statistics. Adding noise to the census tables only, and not to the demographic tables, may obviously lead to small inconsistencies. Especially when both statistics refer to the same reference day, this may be confusing for users.
This was indeed observed in some countries with the Census 2021. A solution to this issue has to be found in the future. Note, however, that because of the randomness of the methods, using TRS and/or CKM may still lead to small inconsistencies if the methods are applied in a non-harmonised way.
Since these statistics will be brought together at the European level under the planned ESOP (European Statistics on Population) regulation, it is clear that plans have to be made on how to protect the ESOP output. At the national level the ongoing modernisation in many countries leads to the same need for harmonised plans to protect the output of population statistics.


Although the precise set of ESOP tables and their publication frequencies have not yet been decided, it is already known that publications will appear more frequently than in a census context and that demographic statistics will become more detailed. Consistent SDC methods have to be included in the future ESOP production processes. The research done in recent years and the experiences of the European Census 2021 will be of great help in taking this work further and deciding on the SDC methods for ESOP. This document can be considered a start of this work. Moreover, this work could in the future help to protect more European integrated population statistics of varying detail and at different periodicities.


Given the large number of SDC methods, it is difficult to cover them all in the necessary depth in these guidelines. For this reason, we concentrate on what we consider to be the most relevant. However, we can also recommend the `Guidelines for SDC methods applied on Geo-Referenced Data' \cite{Guidelines1_GeoGL} to the interested reader, in which methods more targeted at cartographic publications, such as the Quadtree Method or Spatial Smoothing, are dealt with.

The current guidelines are intended to help statisticians in different countries who need to protect detailed census and demographic outputs. Chapter~\ref{ch:sdcmeth} of this report gives an overview of SDC methods for census and demographic tables. A number of current methods are discussed, as well as software to apply the methodology, and the SDC methods are compared. Chapter~\ref{ch:consistency} presents consistency and disclosure risk issues. The difficult and country-specific issue of defining parameters when using SDC methods for census and demographic tables is handled in Chapter~\ref{ch:params}. Finally, communication of SDC methods to data users is the topic of Chapter~\ref{ch:comm}, which also presents communication examples from several countries.
44 changes: 44 additions & 0 deletions chapters/00_frontpage.tex
@@ -0,0 +1,44 @@

\pagenumbering{gobble}

\begin{center}

\sffamily
\bfseries
{\Large WP2 of STACE project}\bigskip

{\Large Grant agreement 899218 – 2019-BG-Methodology}\vspace{72pt}

{\Large Date}\bigskip

\mdseries
{\large August 31, 2024}\vspace{36pt}

\bfseries
{\Large Task T2.4}\bigskip

\mdseries
{\large Guidelines}\vspace{36pt}

\bfseries
{\Large Deliverable D2.11}\bigskip

\mdseries
{\large Guidelines for SDC Methods for Census and Demographics Data}\vspace{72pt}

\bfseries
{\Large Sensitivity}\bigskip

\mdseries
{\large Available to general public}\vspace{90pt}
\rmfamily

\includegraphics{eu_funded.png}

\end{center}


\newpage


\pagenumbering{arabic}
6 changes: 6 additions & 0 deletions chapters/1-Disclosure_Risks.tex
@@ -0,0 +1,6 @@
\chapter{Disclosure Risks in Census and Demographic Tables}\label{ch:discrisk}
\chaptermark{Disclosure Risks} % Short version of chapter name for the page headers


\section{Section Title}
