% !TEX root = GovChain.tex
Finally, we discuss some minor issues that need to be addressed when deploying a platform for publishing government data in an open and transparent manner.
First, the solution we propose might make it somewhat difficult to track a particular document. That is, if one only has a document at hand and no additional metadata, it might be difficult to verify where the document comes from, which Merkle tree contains its hash, or whether it is the latest version. A practical solution is for the API through which the government publishes documents to also return the metadata needed to locate the security certificates of each document. Furthermore, to track the latest version of a document, one can turn either to the government's API or to the monitoring agencies, as described in Section~\ref{sec:updates}.
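To make this concrete, the following is a sketch of the check such metadata would enable, under the assumption (ours, for illustration) that the API returns, together with a document $d$, the Merkle root $r$ and the sibling hashes $s_1, \dots, s_k$ on the path from the leaf $h(d)$ to the root. The verifier sets $v_0 = h(d)$ and, for $i = 1, \dots, k$, computes
\[
v_i = h(v_{i-1} \| s_i) \quad \mbox{or} \quad v_i = h(s_i \| v_{i-1}),
\]
depending on whether the current node is a left or a right child. The document is accepted if $v_k = r$ and $r$ coincides with the root recorded on the blockchain. The leaf encoding and the concatenation order used here are assumptions that the platform's specification would have to fix.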
%That is, when downloading each document, one would also obtain the root of its Merkle tree, together with the witnessing path of the inclusion in this Merkle tree, plus the tree's address in the Ethereum blockchain. With this information we can now easily check that the document we have was included in the official data published by the government, plus find a proof of this on Ethereum's blokchain. Furthermore, to track the latest version of the document, we can now turn either to the government's API or to the monitoring agencies, as described in Section \ref{sec:updates}.
%Another issue we face when dealing with government records is that they are usually not stored in standardized formats, and most likely use spreadsheet software such as Microsoft Excel, or text processors such as Microsoft Word. One issue with these tools is that they will change the document's hash value every time they open a document, even when no visible changes have been made (i.e. no new character has been added/removed), mostly due to the meta data they store along with the document. A similar issue can occur based on which operating system is used when processing the document. In both cases, a person who downloaded the document might compute its hash, and obtain a value that is different that that when a hash of the downloaded document was computed. To resolve this, one needs to have a clearly specified format in which the government's documents will be published, and inform the people using it of potential issues. Some solutions here include publishing data in binary format, or as pdf.
Another issue we face when dealing with government records is that they are usually not stored in standardized formats, and are most likely produced with spreadsheet or word processing software. One detail worth noting is that the encoding of widely adopted formats includes data that can change every time the document is opened or saved, thus changing the hash value even when the user has made no changes. This is clearly an obstacle when downloading documents and verifying Merkle roots. To address this, one needs a clearly specified format in which the government's documents will be published, and the people using them must be informed of potential issues. %Some solutions include publishing data in binary format, or as pdf.
Also, care has to be taken with the specifics of the hash function being used. For instance, it is well known that many widely used hash functions (such as SHA-1 and SHA-256) process inputs block by block following the Merkle--Damg{\aa}rd construction and are thus vulnerable to length extension attacks. This can be addressed with extra computation, such as using $h(x \| h(x))$ instead of only $h(x)$. Alternatively, one can opt for hash functions that resist such attacks~\cite{keccak}.
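To see where the weakness comes from, consider the following simplified sketch; the symbols $f^{*}$, $\mathit{IV}$, and $\mathrm{pad}$ are our own notation, introduced only for illustration. A Merkle--Damg{\aa}rd hash can be written as $h(x) = f^{*}(\mathit{IV}, x \| \mathrm{pad}(x))$, where $f^{*}(s, m)$ iterates the compression function over the blocks of $m$ starting from state $s$, and $\mathrm{pad}(x)$ is padding that depends only on the length of $x$. If the digest is the full final state, then for any suffix $y$
\[
h(x \| \mathrm{pad}(x) \| y) = f^{*}\bigl(h(x),\, y \| \mathrm{pad}(x \| \mathrm{pad}(x) \| y)\bigr),
\]
so anyone who knows $h(x)$ and the length of $x$ can compute the hash of an extended input without knowing $x$ itself.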
Overall, these illustrate some of the specific issues one may face when designing a platform for transparent publishing of government data.
% \francisco{Microsoft attacks, One millisecond attacks, Length extension attacks}
% \francisco{Also the programmed updates attack (when documents are predictable and hash lists do not include outer data). Is it harmful?}