Firefly Trust Sync

By Tom Marble

Abstract

This paper introduces the Firefly Trust Sync (Firefly) architecture as a decentralized, web-of-trust alternative that addresses the shortcomings of the Certificate Authority (CA) based Public Key Infrastructure (CA-based PKI) and the Pretty Good Privacy (PGP) web-of-trust. Self-sovereign identity is a cornerstone of this architecture, and yet it does not rely on distributed ledger technology at all. Essential design elements are presented, along with initial thoughts on the advantages and disadvantages of this approach and some next steps.

  1. Abstract
  2. Introduction
    1. Problems
    2. Goals
  3. Design
    1. Extending SQRL for peer to peer authentication
    2. Bootstrapping with DNS and TLS
    3. A distributed graph database
    4. Anonymity by default
    5. Optional Pet Names
    6. Onion query obfuscation
  4. Use Cases
    1. Invitation to Firefly
    2. Identity linking
    3. Verify web site certificates
    4. Verify ssh server fingerprints
    5. Send Secure e-mail
    6. Trust revocation
  5. Advantages
    1. Distributed secure database
    2. Energy conservation
    3. There is no cabal
    4. No key escrow
    5. Network tolerant of identity server failures
    6. Anti-spam provision
    7. Anonymity by default
  6. Disadvantages
    1. A server for each identity
    2. UX Challenge
  7. Next Steps
    1. Immutability and GDPR Compliance
    2. Open Source Prototype
    3. Performance Analysis
    4. Auditing and reproducible builds
    5. Threat Assessments
    6. Future

Introduction

The success of the Internet has been based on the independence and asynchrony of participants growing organically. The Domain Name System (DNS1) is one of the essential Internet services: it functions as a federation of decentralized servers and has tracked the growth of the Internet itself. In the first wave of e-commerce a solution was needed for consumers to safely transmit their credit card information to merchants, and that drove the development of Secure Sockets Layer (SSL), later Transport Layer Security (TLS), and the CA-based PKI. The PGP Web of Trust is an effort to leverage public key cryptography for a slightly more generalized model of trust. And yet all of these technologies are succumbing to growing pains or outright attacks.

The name Firefly evokes one of many biomimetic lessons we may discover in nature2.

Problems

The Firefly architecture proposes to address these problems:

The CA-based PKI is fragile

The primary problem with the CA-based PKI is the prevalence of Man in the Middle attacks (MiTM3). There have been a series of band-aid measures like Extended Validation certificates (now seen as a failure4) and Certificate Transparency5. Yet Certificate Transparency audits can't scale (directly), leave time windows of vulnerability, may leak browsing information, and lack a standard user experience (UX) in case of auditing failure6.

The PGP Web of Trust has problems

The network of automatically synchronized public key servers7 has worked remarkably well despite some unfortunate design choices: the identities (UIDs, which are often e-mail addresses) are completely exposed and anyone can attach certifications to a key. Largely due to UX failures, the convenience of using short IDs led to the Evil-32 attack8. And to make matters worse we are now seeing certificate flooding9.

Failure of secure e-mail

Despite the availability of core technologies to send PGP encrypted e-mail, it has simply failed due to incredible complexity and an absolute lack of UX design. In its place are a myriad of ad-hoc solutions, often involving some poor messaging implementation on a website which is "secured" by TLS10.

Goals

Firefly architecture aspires to the following goals:

An additional TLS Certificate check

Validating TLS certificates in Firefly could provide additional assurance to Certificate Transparency Auditing and/or offer an alternative verification path for certificates that do not participate in the CA-based PKI.

An alternative public key verification for identities

Perhaps the most common use of the PGP Web of Trust is to verify the public key of an e-mail correspondent with whom we would like to share secure messages. Firefly could replicate this functionality without some of the design shortcomings of the PGP network while providing countermeasures for known attacks.

Avoid the environmental catastrophe of the blockchain

While it is fashionable for self-sovereign digital identity solutions to rely on distributed ledger technology11, Firefly offers an alternative which will not exacerbate climate change12.

Design

The core principle of Firefly is that each identity owner shall have exclusive control of their private key material. The impact of this design choice is that if the proposed pre-planned key backup and recovery procedures are not followed there is literally no one to call – the identity will be lost. And yet this is the ultimate way to ensure against abuse by third parties – including services offering convenience via key material management or escrow. This creates an additional incentive to take proper care of key material.

Extending SQRL for peer to peer authentication

Software implementations of Firefly will be built "on the shoulders of giants" as is most software. Core key management will be handled with Steve Gibson's public domain Secure Quick Reliable Login (SQRL13) technology. The essential concepts leveraged from SQRL include:

A unique public key pair for every identity relationship

SQRL is designed around a master-password-based Password Based Key Derivation Function (PBKDF) which generates an Ed25519 elliptic curve public key pair14 for every website (identity client) requiring authentication.

This leads to one (of many) SQRL advantages in that disparate websites cannot collude to merge identities as the public key for each site is unique. And it obviates using a password manager as the master password is all that is needed to access the keypair for a given client.
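The per-site derivation idea can be sketched with the Python standard library alone. This is a minimal sketch under stated assumptions, not SQRL's actual algorithm: PBKDF2 stands in for SQRL's own password-stretching function, the salt and domain names are hypothetical, and the 32-byte output is only the seed from which an Ed25519 key pair would be generated (Ed25519 itself is not in the standard library, so that step is omitted).

```python
import hashlib
import hmac


def master_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Stretch the master password into a 32-byte master key.

    PBKDF2-HMAC-SHA256 is an assumption used for illustration only.
    """
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)


def site_seed(master: bytes, domain: str) -> bytes:
    """Derive a 32-byte per-site seed by keying HMAC-SHA256 with the master key.

    The seed would serve as the private seed of the site's Ed25519 key pair.
    """
    return hmac.new(master, domain.lower().encode("utf-8"), hashlib.sha256).digest()


# Hypothetical password, salt, and domains.
master = master_key("correct horse battery staple", salt=b"hypothetical-salt")
seed_alice = site_seed(master, "alice.example")
seed_bob = site_seed(master, "bob.example")
```

Because each seed depends on the domain, two sites see unrelated keys, while the same password and domain always reproduce the same seed, which is why no per-site secret needs to be stored.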

A concise, offline, paper based identity backup solution

The master SQRL identity information can be represented in a few hundred bytes and is designed to be printed on paper as a QR code for ease of machine scanning (which is useful when sharing an identity between multiple devices). The offline backup is encrypted using a 24-digit decimal Rescue Code which, obviously, should be stored separately.

Firefly extensions involve using SQRL deliberately for machine-to-machine communication (in addition to human user to machine communication) as well as supporting a superset of the SQRL API for web-of-trust operations.

Bootstrapping with DNS and TLS

Any alternate web-of-trust solution could not be "self hosted" initially and thus Firefly will need to begin (at least) by relying on existing DNS and CA-based PKI solutions. Even if self hosting becomes feasible it makes much more sense to use the existing DNS and CA-based PKI as additional inputs to the trust heuristic (i.e. the answers may provide fascinating telemetry that might suggest that a MiTM is being attempted, for example).

To the degree there is concern about DNS accuracy and/or privacy an interim solution may involve using DNS over HTTPS (DOH15).

Fortunately it is quite easy now to automate creation of Let's Encrypt16 TLS certificates.

A distributed graph database

Each Firefly identity node maintains a set of trust assertions that may be queried. Unlike the Secure Keyserver network there is no attempt to constantly replicate the entirety of the trust database everywhere. Depending on the type of query, a node may delegate it to peers and coalesce their responses for the caller. In this way the query function will move to where the data is located (not vice versa). As the web-of-trust is inherently a social graph it follows that the data should be organized as a graph database17. And recursive queries can have a significant performance advantage in a graph database vs. a relational database18.
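Delegating a query to peers and coalescing the answers might look roughly like the following. This is an illustrative sketch only: the `Node` class, the `hops` budget, and the `seen` set used to avoid loops are all assumptions, and a real deployment would make network calls rather than in-process method calls.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (assertion id, attribute, value)


@dataclass
class Node:
    """A hypothetical Firefly identity server holding its own trust assertions."""
    domain: str
    facts: Set[Triple] = field(default_factory=set)
    peers: List["Node"] = field(default_factory=list)

    def query(self, attribute: str, value: str, hops: int = 2,
              seen: Optional[Set[str]] = None) -> Set[Triple]:
        """Answer locally, then delegate to unvisited peers while hops remain."""
        seen = set() if seen is None else seen
        seen.add(self.domain)
        results = {f for f in self.facts if f[1] == attribute and f[2] == value}
        if hops > 0:
            for peer in self.peers:
                if peer.domain not in seen:
                    # Coalesce the peer's answers into one set for the caller.
                    results |= peer.query(attribute, value, hops - 1, seen)
        return results


alice, bob, carol = Node("alice.example"), Node("bob.example"), Node("carol.example")
alice.peers = [bob]
bob.peers = [alice, carol]
bob.facts.add(("a2", "wot:id", "Bk"))
carol.facts.add(("a1", "wot:id", "Bk"))

near = alice.query("wot:id", "Bk", hops=1)  # reaches Bob only
far = alice.query("wot:id", "Bk", hops=2)   # reaches Bob and Carol
```

The data never moves wholesale: each node evaluates the query against its own facts and only the matching assertions travel back to the caller.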

As each identity requires a server the Firefly architecture begins at the maximum possible level of distribution. Just as a well known public key has multiple signatures, a well known identity in the Firefly network will have trust assertions on multiple Firefly servers. In this way the network is resilient against node failures.

In creating a query framework intended to gather multiple data points as input to trust heuristics, Firefly can weigh the importance of multiple factors including: social proximity, number of positive assertions, number of revocations, age of assertions, and input from classic DNS and/or CA-based PKI or the PGP Web of Trust. Each heuristic may require differing dimensions of consensus depending on the sensitivity of the decision. In the web-of-trust we don't expect the simplicity of "one" answer and yet the signal of consensus will outweigh the noise when an assertion is trustworthy.
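One way to picture such a heuristic is a weighted score over the evidence a query returns. The sketch below is purely illustrative: the `Evidence` fields, the proximity and freshness formulas, the revocation penalty, and the threshold are hypothetical choices, not part of the Firefly design.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Evidence:
    hops: int         # social distance from the querier to the asserting identity
    age_days: float   # how long ago the fact was recorded
    revoked: bool     # True for a revocation, False for a positive assertion


def trust_score(evidence: List[Evidence], half_life_days: float = 365.0) -> float:
    """Sum weighted evidence: closer and fresher facts count more,
    and a revocation counts twice as much as a positive assertion."""
    score = 0.0
    for ev in evidence:
        proximity = 1.0 / (1 + ev.hops)
        freshness = 0.5 ** (ev.age_days / half_life_days)
        weight = proximity * freshness
        score += -2.0 * weight if ev.revoked else weight
    return score


def is_trusted(evidence: List[Evidence], threshold: float = 1.0) -> bool:
    """A more sensitive decision would simply use a higher threshold."""
    return trust_score(evidence) >= threshold


positive = [Evidence(hops=0, age_days=0, revoked=False),
            Evidence(hops=1, age_days=0, revoked=False)]
contested = positive + [Evidence(hops=0, age_days=0, revoked=True)]
```

Raising `threshold` for sensitive decisions is one simple reading of "differing dimensions of consensus"; other readings, such as requiring agreement from independent sources, would need a richer model.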

Fortunately there is a wealth of resources in supporting graph database design. By adding time to the traditional entity–attribute–value model19 it is possible to enable queries over time (and presents an opportunity to take advantage of immutability20). The EAVT schema may be more like the Resource Description Framework (RDF21) Schema22 or more like other popular EAVT schemas23.
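To make the EAVT idea concrete, here is a minimal sketch of an append-only log of facts with an "as of" query. The `Datom` name, the integer timestamps, and the sample values are assumptions for illustration; the actual Firefly schema is not yet specified.

```python
from __future__ import annotations

from typing import Dict, List, NamedTuple, Tuple


class Datom(NamedTuple):
    e: str  # entity (here, an assertion id)
    a: str  # attribute
    v: str  # value
    t: int  # time the fact was recorded (hypothetical integer timestamp)


def as_of(log: List[Datom], t: int) -> Dict[Tuple[str, str], str]:
    """Return the latest value of each (entity, attribute) pair,
    considering only facts recorded at or before time t."""
    view: Dict[Tuple[str, str], str] = {}
    for d in sorted(log, key=lambda d: d.t):
        if d.t <= t:
            view[(d.e, d.a)] = d.v
    return view


# Facts are only ever appended; nothing is updated in place.
log = [
    Datom("aid1", "wot:webcert", "digest-old", 100),
    Datom("aid1", "wot:webcert", "digest-new", 200),
]
then = as_of(log, 150)
now = as_of(log, 250)
```

Because the log is never rewritten, the same query can be replayed for any point in time, which is what makes auditing past trust decisions possible.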

The implementation of the query language ("Firefly Script") takes inspiration from Datalog24 and SPARQL25. Especially apropos for running queries across the Firefly network is the SPARQL Federated Query26 standard and implementations27.

Anonymity by default

As each Firefly public key is scoped to the context of a given identity server (as inherited from SQRL) each assertion is, by default, anonymous. Each Firefly user can decide the degree to which their public key on any node may be associated with other identities and/or metadata. In this way a public key may be associated with a traditional UID (e.g. e-mail address, service URL) or remain opaque.

Optional Pet Names

As noted in the paper on Pet Names28 "Zooko's Triangle29 tells us that names can have two out of three properties: decentralized, globally unique, human meaningful."

The domain name of a Firefly server (the scope) and a user public key within that server acts as a globally unique identifier. By design it is decentralized, but alas not human meaningful.

Thus one type of trust assertion in Firefly is the association of a Pet Name to a Firefly identity, giving us hope for human readability and verification.

Onion query obfuscation

In some cases it may be desirable for Firefly queries to identify their provenance (the requesting server), but this is not strictly required. By leveraging work on federated queries and existing Internet Protocol algorithms (e.g. Time-To-Live) one can imagine constructing a layered approach to querying. At each hop in the query a Firefly server is only aware of the identity of the client and does not need to reveal the peers to which it will forward the query. Given a minimum number of hops this may provide TOR30 like query anonymity.

Use Cases

The following use cases are examples of how the Firefly Trust Sync architecture will operate:

Invitation to Firefly

As with many distributed networks, participation starts with an invitation. Firefly's invitation process is inspired by, but hopefully simpler than, that of Secure Scuttlebutt31.

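To explain how Alice will invite Bob to her network these mnemonics will be used:

  • AD : The domain name of Alice's Firefly identity server
  • AK : Alice's private key
  • Ak : Alice's public key
  • BD : The domain name of Bob's Firefly identity server
  • BK : Bob's private key
  • Bk : Bob's public key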

The invitation steps are:

  1. Alice sends Bob an invite code (in the EAVT schema): AD Ak nonce-signed-by-AK
  2. Bob verifies that nonce-signed-by-AK matches Ak
  3. Bob replies by authenticating to AD thus creating BK/Bk
  4. Bob sends a message to Alice: BD Bk nonce-signed-by-BK
  5. Alice verifies that nonce-signed-by-BK matches Bk
  6. Alice (AD : Ak) is now connected to Bob (BD : Bk)
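The handshake above can be sketched end to end in a few lines. Two loud assumptions apply. First, Firefly would use Ed25519 keys via SQRL, but Ed25519 is not in the Python standard library, so this sketch substitutes a Lamport one-time signature built from SHA-256; that is a different scheme, chosen only to keep the example self-contained, and each key here signs exactly once. Second, the message layout, the domain names, and the way Bob's key pair is generated are hypothetical simplifications.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keygen():
    """Return (private, public): 256 pairs of secrets and the hashes of those secrets."""
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _bits(message: bytes):
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, message: bytes):
    """Reveal one secret per bit of the message hash (safe for ONE message per key)."""
    return [private[i][bit] for i, bit in enumerate(_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    return len(signature) == 256 and all(
        _h(sig) == public[i][bit]
        for i, (sig, bit) in enumerate(zip(signature, _bits(message)))
    )


def make_message(domain: str, private, public) -> dict:
    """Build 'domain, public key, nonce-signed-by-private-key'."""
    nonce = secrets.token_bytes(16)
    return {"domain": domain, "public": public, "nonce": nonce,
            "sig": sign(private, nonce)}


def check_message(msg: dict) -> bool:
    return verify(msg["public"], msg["nonce"], msg["sig"])


# Steps 1-2: Alice sends the invite, Bob verifies it against Ak.
AK, Ak = keygen()
invite = make_message("alice.example", AK, Ak)
invite_ok = check_message(invite)

# Steps 3-5: Bob creates BK/Bk and replies, Alice verifies against Bk.
BK, Bk = keygen()
reply = make_message("bob.example", BK, Bk)
reply_ok = check_message(reply)

# Step 6: both checks passed, so the two identities are connected.
connected = invite_ok and reply_ok

# A tampered nonce no longer matches the signature.
tampered_ok = check_message(dict(invite, nonce=b"x" * 16))
```

Note that verifying a signature against a key carried in the same message only proves possession of the private key; binding that key to Alice still depends on how the invite code reached Bob.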

Identity linking

On Alice's Firefly identity server she wants to assert that she trusts Bob's identity by entering a fact in the graph database:

  1. Alice inserts an assertion (this example inspired by RDF where aid is a unique assertion id):
    • aid, wot:truster, Ak
    • aid, wot:id, Bk
    • aid, wot:sig, AK-signs-Bk

Verify web site certificates

We would like to use Firefly as verification for web site certificates.

Now adding the following mnemonics:

  • W : Website URL
  • Wd : The TLS certificate digest for W

Bob wants to publish an assertion that he trusts a given website's certificate:

  1. Bob inserts an assertion (on BD)
    • aid, wot:truster, Bk
    • aid, wot:web, W
    • aid, wot:webcert, Wd
    • aid, wot:sig, BK-signs-W-Wd
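As a rough illustration, the assertion above could be built and later checked as follows. The choice of SHA-256 over the DER-encoded certificate for Wd, the signed message layout, and the helper names are all assumptions; `placeholder_sign` is explicitly not a real signature and only marks where signing with BK would happen.

```python
import hashlib
from typing import Callable, List, Tuple

Triple = Tuple[str, str, str]  # (assertion id, attribute, value)


def cert_digest(der_bytes: bytes) -> str:
    """Hex SHA-256 digest of a DER-encoded certificate (the Wd mnemonic)."""
    return hashlib.sha256(der_bytes).hexdigest()


def web_assertion(aid: str, truster: str, url: str, der_bytes: bytes,
                  sign: Callable[[bytes], str]) -> List[Triple]:
    """Build the four facts of a website certificate assertion."""
    wd = cert_digest(der_bytes)
    return [
        (aid, "wot:truster", truster),
        (aid, "wot:web", url),
        (aid, "wot:webcert", wd),
        (aid, "wot:sig", sign(f"{url} {wd}".encode("utf-8"))),
    ]


def matches(assertion: List[Triple], url: str, der_bytes: bytes) -> bool:
    """Check a certificate presented by `url` against a stored assertion."""
    facts = {a: v for _, a, v in assertion}
    return (facts.get("wot:web") == url
            and facts.get("wot:webcert") == cert_digest(der_bytes))


def placeholder_sign(message: bytes) -> str:
    """NOT a signature: a labeled stand-in for signing with BK."""
    return "unsigned:" + hashlib.sha256(message).hexdigest()


fake_cert = b"hypothetical DER bytes"  # stands in for a real certificate
assertion = web_assertion("aid-1", "Bk", "https://www.example.org/",
                          fake_cert, placeholder_sign)
same = matches(assertion, "https://www.example.org/", fake_cert)
swapped = matches(assertion, "https://www.example.org/", b"a different certificate")
```

In practice the DER bytes could be obtained with Python's `ssl.get_server_certificate` followed by `ssl.PEM_cert_to_DER_cert`; a mismatch like `swapped` is the kind of signal that might indicate a MiTM attempt.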

Given this information in the Firefly database we can propose an extension to Monkeysphere32 to query Firefly in addition to (or as a replacement for) the PGP Web of Trust.

Verify ssh server fingerprints

In the same way that Monkeysphere can improve the security and user experience for both ssh server administrators and users33 we could propose extensions to Monkeysphere for server identities.

Here the assertion entered into the Firefly database is analogous to that for a web server above (and either may allow specifying a non-standard port).

Send Secure e-mail

By selecting an e-mail address as a Pet Name and making trust assertions about it, two (or more) correspondents may trust they have the proper public key with which to encrypt secure e-mail for each other.

If Alice wants to associate the e-mail [email protected] with BD : Bk she can insert this fact in the database:

  1. Alice trusts Bob's e-mail
    • aid, wot:truster, Ak
    • aid, wot:firefly, BD
    • aid, wot:id, Bk
    • aid, wot:email, [email protected]
    • aid, wot:sig, AK-signs-BD-Bk-email

Alice then knows she can encrypt e-mail for Bob, and (through analogous steps) Bob knows he can encrypt e-mail for Alice by associating [email protected] with AD : Ak.

At this point the secure mail exchange approach of Autocrypt34 could be used for Alice and Bob to communicate securely.

Trust revocation

As the database records a timestamp with every entry (the T in EAVT) revoking a previous trust assertion is as easy as inserting a new fact with the attribute wot:revoke in assertions where wot:sig had been used.
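A revocation check over such a log might look like the sketch below. It assumes the revoking fact is recorded under the same assertion id as the original wot:sig fact, which the text does not spell out; the `Fact` type, timestamps, and values are hypothetical.

```python
from typing import List, NamedTuple


class Fact(NamedTuple):
    e: str  # assertion id
    a: str  # attribute
    v: str  # value
    t: int  # time the fact was recorded


def is_active(log: List[Fact], aid: str, at: int) -> bool:
    """An assertion is active at time `at` if it was signed by then
    and no wot:revoke fact for it had been recorded by then."""
    signed = any(f.e == aid and f.a == "wot:sig" and f.t <= at for f in log)
    revoked = any(f.e == aid and f.a == "wot:revoke" and f.t <= at for f in log)
    return signed and not revoked


log = [
    Fact("aid1", "wot:truster", "Ak", 100),
    Fact("aid1", "wot:id", "Bk", 100),
    Fact("aid1", "wot:sig", "AK-signs-Bk", 100),
    # Revocation is just another appended fact; nothing is deleted.
    Fact("aid1", "wot:revoke", "AK-signs-Bk", 300),
]
before = is_active(log, "aid1", at=200)
after = is_active(log, "aid1", at=400)
```

Since the original facts remain in the log, a verifier can still see that the assertion was valid between the two timestamps.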

Advantages

Here are some advantages of the Firefly Trust Sync architecture:

Distributed secure database

Considering Firefly as a distributed, secure database, one can see the immense benefits over using the blockchain as a database35.

Would you use a database with these features?

  • Uses approximately the same amount of electricity as could power an average American household for a day per transaction
  • Supports 3 transactions / second across a global network with millions of CPUs/purpose-built ASICs
  • Takes over 10 minutes to "commit" a transaction
  • Doesn’t acknowledge accepted writes: requires you read your writes, but at any given time you may be on a blockchain fork, meaning your write might not actually make it into the “winning” fork of the blockchain (and no, just making it into the mempool doesn’t count). In other words: “blockchain technology” cannot by definition tell you if a given write is ever accepted/committed except by reading it out of the blockchain itself (and even then)
  • Can only be used as a transaction ledger denominated in a single currency, or to store/timestamp a maximum of 80 bytes per transaction

But it’s decentralized!

Energy conservation

In making trust assertions by writing to a distributed graph database, Firefly is much more energy efficient than having multiple computers sweat to brute force a one way function36:

By late next year, bitcoin could be consuming more electricity than all the world’s solar panels currently produce – about 1.8 percent of global electricity. That would effectively erase decades of progress on renewable energy.

Given the urgency of global climate change we cannot, in good conscience, advocate an alternative web-of-trust solution which is on track to consume 32 terawatt-hours (TWh) of energy37.

There is no cabal

Some might advocate that "proof of stake"38 blockchain technology won't suffer from some of the problems of "proof of work" technology39.

And yet, in practice, it would seem that there is still a cabal of minters that have asymmetric community power to create Initial Coin Offerings (ICOs) and reap the benefits of the droit du Seigniorage40.

In Firefly there is no dependence on any one specific, occult blockchain network.

No key escrow

Firefly users are in complete control of their private key material and do not need to trust any third party "escrow" service.

Network tolerant of identity server failures

As each identity is represented by a Firefly node (maximal distribution) and widely held trust assertions are dispersed across the network it is tolerant of temporary identity server failure.

Anti-spam provision

As only trusted identities can make assertions the dispersion of naively "astroturfing" facts is limited.

Anonymity by default

Firefly makes Pet Name style associations on a voluntary basis by starting at a default of anonymity.

Disadvantages

Here are some disadvantages of the Firefly Trust Sync architecture that need further consideration:

A server for each identity

While each Firefly identity does not require a physical "server" per se, it does require its own domain name, TLS certificate, and web service API implementation.

The cost of asserting identities in Firefly is much higher than in the current PGP Web of Trust. While this may have the benefit of discouraging the propagation of "noise", it is still a barrier to adoption.

UX Challenge

One of the failures of the PGP Web of Trust was the utter lack of user experience design. While the building blocks of SQRL, Monkeysphere and Autocrypt have made great progress in addressing parts of the UX challenge, Firefly will need comprehensive UX design to be successful.

Next Steps

Further discussion and analysis of the Firefly Trust Sync architecture is welcome. Here are a few important next steps to consider:

Immutability and GDPR Compliance

There is a certain engineering elegance to an immutable database, and "append only" logs have significant benefits from a security auditing point of view. But what do we do when the "nuclear launch codes" get accidentally added to the database? Or what about very offensive speech?

An early question to resolve is whether there is any way to get the security (and other benefits) of an immutable database41 and yet be in compliance with Europe's General Data Protection Regulation (GDPR42).

Open Source Prototype

There's nothing like working code to demonstrate that an architecture design is plausible. An essential next step will be to build open source prototypes of Firefly, including necessary extensions to SQRL, Monkeysphere and Autocrypt.

Performance Analysis

Given a prototype Firefly implementation it will be possible to test common operations under synthetic load to determine likely performance characteristics.

Auditing and reproducible builds

Throughout the discussion of Firefly the focus has been about trusting the data in the graph database. Clearly Firefly needs to enable users to trust the software that implements the system as well.

There may be an opportunity to make trust assertions about the reproducibility43 of Firefly software such that users can trust the "entire chain of custody" of a trust assertion (including the software).

Threat Assessments

A basic threat assessment must be done to elaborate expected attacks, such as denial of service.

For each of these threats a mitigation or countermeasure should be proposed.

Future

Given a basic implementation of Firefly there are some interesting future possibilities to consider:

  • Could Firefly help improve the trust of Internet of Things (IoT) devices?
  • Could small IoT devices be represented by a Firefly proxy?
  • Would it be possible to support TOR-like anonymous queries?
  • Could Firefly become the basis for a generalized trust API44?

Footnotes

1 Domain Name System
https://en.wikipedia.org/wiki/Domain_Name_System

2 SYNC: The Emerging Science of Spontaneous Order, by Steven H. Strogatz
http://www.stevenstrogatz.com/books/sync-the-emerging-science-of-spontaneous-order

3 New Research Suggests That Governments May Fake SSL Certificates
https://www.eff.org/deeplinks/2010/03/researchers-reveal-likelihood-governments-fake-ssl

4 Extended Validation Certificates are Dead
https://www.troyhunt.com/extended-validation-certificates-are-dead/

5 Certificate Transparency Version 2.0
https://tools.ietf.org/id/draft-ietf-trans-rfc6962-bis-30.html

6 How will Certificate Transparency Logs be Audited in Practice?
https://www.agwa.name/blog/post/how_will_certificate_transparency_logs_be_audited_in_practice

7 Secure Keyserver Network
https://bitbucket.org/skskeyserver/sks-keyserver/wiki/Home

8 Evil-32
https://evil32.com/

9 Community Impact of OpenPGP Certificate Flooding
https://dkg.fifthhorseman.net/blog/community-impact-openpgp-cert-flooding.html

10 Modern Alternatives to PGP
https://blog.gtank.cc/modern-alternatives-to-pgp/

11 Decentralized Identifiers (DIDs)
https://w3c-ccg.github.io/did-spec/#introduction

12 Bitcoin’s energy usage is huge – we can't afford to ignore it
https://www.theguardian.com/technology/2018/jan/17/bitcoin-electricity-usage-huge-climate-cryptocurrency

13 Secure Quick Reliable Login
https://www.grc.com/sqrl/sqrl.htm

14 SQRL Detailed Cryptographic Design
https://www.grc.com/sqrl/crypto.htm

15 DOH: DNS over HTTPS has mitigation
https://en.wikipedia.org/wiki/DNS_over_HTTPS

16 Let's Encrypt
https://letsencrypt.org/

17 Graph Database
https://en.wikipedia.org/wiki/Graph_database

18 Neo4j is faster than MySQL in performing recursive query
https://maxdemarzi.com/2017/02/06/neo4j-is-faster-than-mysql-in-performing-recursive-query/

19 Entity–attribute–value model
https://en.wikipedia.org/wiki/Entity%E2%80%93attribute%E2%80%93value_model

20 The rise of immutable data stores
https://usblogs.pwc.com/emerging-technology/the-rise-of-immutable-data-stores/

21 Resource Description Framework
https://en.wikipedia.org/wiki/Resource_Description_Framework

22 RDF Schema
https://en.wikipedia.org/wiki/RDF_Schema

23 Unofficial guide to Datomic internals
https://tonsky.me/blog/unofficial-guide-to-datomic-internals/

24 Datalog
https://en.wikipedia.org/wiki/Datalog

25 SPARQL
https://en.wikipedia.org/wiki/SPARQL

26 SPARQL 1.1 Federated Query
https://www.w3.org/TR/sparql11-federated-query/

27 Blazegraph Federated Query
https://wiki.blazegraph.com/wiki/index.php/FederatedQuery

28 Pet Names
https://github.com/cwebber/rebooting-the-web-of-trust-spring2018/blob/petnames/draft-documents/making-dids-invisible-with-petnames.md

29 Zooko's Triangle
https://en.wikipedia.org/wiki/Zooko%27s_triangle

30 TOR
https://www.torproject.org/

31 Secure Scuttlebutt
https://ssbc.github.io/scuttlebutt-protocol-guide/

32 Monkeysphere
http://monkeysphere.info/

33 Monkeysphere for SSH
http://monkeysphere.info/getting-started-ssh/

34 Autocrypt
https://autocrypt.org/

35 On the dangers of a blockchain monoculture
http://tonyarcieri.com/on-the-dangers-of-a-blockchain-monoculture

36 New Study of Bitcoin’s Energy Use Makes You Libertarian Nerds Look Even Worse Than Usual
https://www.motherjones.com/environment/2018/05/new-study-of-bitcoins-energy-use-makes-you-libertarian-nerds-look-even-worse-than-usual/

37 The Environmental Case Against Bitcoin
https://newrepublic.com/article/146099/environmental-case-bitcoin

38 Avalanche (AVA) — Blockchain 3.0: A Novel Metastable Consensus Protocol
https://hackernoon.com/avalanche-ava-blockchain-3-0-a-novel-metastable-consensus-protocol-28cdc4ee8984

39 Scalable and Probabilistic Leaderless BFT Consensus through Metastability
https://arxiv.org/abs/1906.08936

40 Seigniorage
https://en.wikipedia.org/wiki/Seigniorage

41 Append-only databases and the GDPR conundrum
https://www.bloorresearch.com/2018/02/append-databases-gdpr-conundrum/?cn-reloaded=1

42 GDPR: What your company should know and do, starting now
https://medium.com/wattx-stories/gdpr-what-your-company-should-know-and-do-starting-now-f62d70f72d7e

43 Reproducible Builds
https://reproducible-builds.org/

44 Fixing Trust on the Internet
https://libreplanet.org/2017/program/#day-2-timeslot-14-session-2-collapse
