update

yzr95924 · May 31, 2021 · 4531dcd · 4531dcd
1 parent e0e9e54
commit 4531dcd
Showing 5 changed files with 251 additions and 4 deletions.
diff --git a/README.md b/README.md
@@ -7,7 +7,7 @@ In this repo, it records some paper related to storage system, including **Data
 
 | Type                    | Paper Amount |
 | ----------------------- | ------------ |
-| A. Data Deduplication   | 116          |
+| A. Data Deduplication   | 83           |
 | B. Erasure Coding       | 37           |
 | C. Security and Privacy | 12           |
 | D. Other                | 7            |
@@ -97,6 +97,8 @@ In this repo, it records some paper related to storage system, including **Data
 25. *Privacy-Preserving Data Deduplication on Trusted Processors*----CLOUD'17 ([link](https://ieeexplore.ieee.org/document/8030573)) [summary]( https://yzr95924.github.io/paper_summary/PrivacyPreservingDedup-CLOUD'17.html )
 26. *Distributed Key Generation for Encrypted Deduplication: Achieving the Strongest Privacy*----CCSW'14 ([link]( https://dl.acm.org/doi/abs/10.1145/2664168.2664169 )) [summary](https://yzr95924.github.io/paper_summary/DistributedKeyGen-CCSW'14.html)
 27. *Proofs of Ownership on Encrypted Cloud Data via Intel SGX*----ACNS'20 ([link](https://link.springer.com/chapter/10.1007/978-3-030-61638-0_22)) [summary](https://yzr95924.github.io/paper_summary/PoWSGX-ACNS'20.html)
+28. *Accelerating Encrypted Deduplication via SGX*----USENIX ATC'21
+29. *S2Dedup: SGX-enabled Secure Deduplication*----SYSTOR'21
 
 ### Computation Deduplication
 
@@ -281,9 +283,10 @@ In this repo, it records some paper related to storage system, including **Data
 5. *Varys: Protecting SGX Enclaves From Practical Side-Channel Attacks*---USENIX ATC'18 ([link](https://www.usenix.org/system/files/conference/atc18/atc18-oleksenko.pdf))
 6. *sgx-perf: A Performance Analysis Tool for Intel SGX Enclaves*----Middleware'18 ([link](https://www.ibr.cs.tu-bs.de/users/weichbr/papers/middleware2018.pdf)) [summary]( https://yzr95924.github.io/paper_summary/SGXPerf-Middleware'18.html )
 7. *TaLoS: Secure and Transparent TLS Termination inside SGX Enclaves*----arxiv'17 ([link](https://www.doc.ic.ac.uk/~fkelbert/papers/talos17.pdf)) [summary](https://yzr95924.github.io/paper_summary/talos-arxiv'17.html)
-8. *Switchless Calls Made Practical in Intel SGX*----SysTex'18 ([link](https://dl.acm.org/doi/pdf/10.1145/3268935.3268942)) 
+8. *Switchless Calls Made Practical in Intel SGX*----SysTex'18 ([link](https://dl.acm.org/doi/pdf/10.1145/3268935.3268942)) [summary](https://yzr95924.github.io/paper_summary/SwitchLess-SysTEX'18.html)
 9. *Regaining Lost Seconds: Efficient Page Preloading for SGX Enclaves*----Middleware'20 ([link](https://dl.acm.org/doi/pdf/10.1145/3423211.3425673))
 10. *Everything You Should Know About Intel SGX Performance on Virtualized Systems*----Sigmeterics'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3322205.3311076)) [summary](https://yzr95924.github.io/paper_summary/SGXPerformance-SIGMETRICS'19.html)
+11. *A Comparison Study of Intel SGX and AMD Memory Encryption Technology*---HASP'18 ([link](https://dl.acm.org/doi/abs/10.1145/3214292.3214301))
 
 ### SGX Storage
 
@@ -296,12 +299,14 @@ In this repo, it records some paper related to storage system, including **Data
 7. SeGShare: Secure Group File Sharing in the Cloud using Enclaves----DSN'20 ([link](http://www.fkerschbaum.org/dsn20.pdf)) [summary](https://yzr95924.github.io/paper_summary/SeGShare-DSN'20.html)
 8. *DISKSHIELD: A Data Tamper-Resistant Storage for Intel SGX*----AsiaCCS'20 ([link](https://dl.acm.org/doi/pdf/10.1145/3320269.3384717))
 9. *SPEED: Accelerating Enclave Applications via Secure Deduplication*----ICDCS'19 ([link](https://conferences.computer.org/icdcs/2019/pdfs/ICDCS2019-49XpIlu3rRtYi2T0qVYnNX/5DGHpUvuZKbyIr6VRJc0zW/5PfoKBVnBKUPCcy8ruoayx.pdf)) [summary](https://yzr95924.github.io/paper_summary/SPEED-ICDCS'19.html)
+12. *Secure In-memory Key-Value Storage with SGX*----SoCC'18
+13. *EnclaveCache: A Secure and Scalable Key-value Cache in Multi-tenant Clouds using Intel SGX*----Middleware'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3361525.3361533)) [summary](https://yzr95924.github.io/paper_summary/EnclaveCache-Middleware'19.html)
 
 ### Network Security
 
 1. *A Privacy-Preserving Defense Mechanism Against Request Forgery Attacks*----TrustCom'11 ([link](https://www.cse.cuhk.edu.hk/~pclee/www/pubs/trustcom11.pdf)) [summary]( https://yzr95924.github.io/paper_summary/DeRef-TrustCom'11.html )
 
-## D. Others
+## D. General Storage
 ### Multi-Cloud System
 1. *Kurma: Secure Geo-Distributed Multi-Cloud Storage Gateways*----SYSTOR'19 [summary](https://yzr95924.github.io/paper_summary/Kurma-SYSTOR'19.html)
 2. *SPANStore: Cost-Effective Geo-Replicated Storage Spanning Multiple Cloud Services*----SOSP'13 [summary](https://yzr95924.github.io/paper_summary/SPANStore-SOSP'13.html)
@@ -312,7 +317,7 @@ In this repo, it records some paper related to storage system, including **Data
 
 1. *In Search of an Understandable Consensus Algorithm*----USENIX ATC'14
 
-### Distributed File System
+### Storage System
 1. *Ceph: A Salable, High-Performance Distributed File System*----OSDI'06 
 2. *The Hadoop Distributed File System*----MSST'10 ([link](http://storageconference.us/2010/Papers/MSST/Shvachko.pdf)) [summary](https://yzr95924.github.io/paper_summary/HDFS-MSST'10.html)
 3. *RADOS: A Scalable, Reliable Storage Service for Petabyte-scale Storage Clusters*----PDSW'07
@@ -321,10 +326,17 @@ In this repo, it records some paper related to storage system, including **Data
 6. *The Google File System*----SOSP'03
 7. *Bigtable: A Distributed Storage System for Structured Data*----OSDI'06
 
+### Cache
+
+1. *TinyLFU: A Highly Efficient Cache Admission Policy*----ACM ToS'17 
+
+
 ### Hash
 1. *Compare-by-Hash: A Reasoned Analysis*----USENIX ATC'06 ([link](https://www.usenix.org/legacy/event/usenix06/tech/full_papers/black/black.pdf)) [summary](https://yzr95924.github.io/paper_summary/CompareByHash-ATC'06.html)
 2. *An Analysis of Compare-by-Hash*----HotOS'03 ([link](http://www.cs.utah.edu/~shanth/stuff/research/dup_elim/hash_cmp.pdf))
 
 ### Streaming Process
 1. *A Lock-Free, Cache-Efficient Multi-Core Synchronization Mechanism for Line-Rate Network Traffic Monitoring*----IPDPS'10
 
+
+
diff --git a/StoragePaperNote/Security/SGX-Storage/EnclaveCache-Middleware'19.md b/StoragePaperNote/Security/SGX-Storage/EnclaveCache-Middleware'19.md
@@ -0,0 +1,108 @@
+---
+typora-copy-images-to: ../paper_figure
+---
+EnclaveCache: A Secure and Scalable Key-Value Cache in Multi-tenant Clouds using Intel SGX
+------------------------------------------
+|           Venue            |       Category       |
+| :------------------------: | :------------------: |
+| Middleware'19 | SGX Storage |
+[TOC]
+
+## 1. Summary
+### Motivation of this paper
+- Motivation 
+  - In-memory key-value caches such as Redis and Memcached have been widely used to speed up web applications and reduce the burden on backend databases.
+  - Data security is still a major concern, which affects the adoption of cloud caches (multi-tenant environment)
+    - co-located malicious tenants 
+    - the untrusted cloud provider
+- Limitation of existing approaches 
+  - virtualization and containerization technologies
+    - achieved tenant isolation at the cost of system scalability, resource contention
+  - adopt property-preserving encryption to enable query processing over encrypted data
+    - suffer from high computation overhead and information leakage
+- Threat model
+  - multiple mutually distrusting parties in a multi-tenant cloud environment
+  - a privileged adversary can access the data stored outside the trusted environment
+  - malicious tenants may make spurious accesses to increase their own cache hit rate and evict the data of co-located tenants from memory
+
+### EnclaveCache
+
+- Main idea
+  - enforce data isolation among co-located tenants using multiple SGX enclaves
+  - securely guard the encryption key of each tenant by the enclave
+  - key question: how to utilize SGX enclaves to realize secure key-value caches within the limited trusted memory
+    - remains an open question
+- Key design decisions
+  - tenant isolation
+    - allow multiple tenants to share a single cache instance, and `each tenant gets a separate enclave as a secret container`
+  - data protection
+    - plaintext data only stays inside enclaves to get serialized, deserialized and processed, and the data is encrypted once it leaves the enclave.
+- Cache isolation
+  - application container: support un-modified applications inside enclaves (`bad scalability`)
+    - e.g., SCONE
+  - data container: hosting only each tenant's data in a dedicated enclave (`oversubscribe the SGX resources`)
+  - secret container: storing only the sensitive information as well as the critical code in enclaves (`this paper's design`)
+- Architecture 
+  - ![image-20210529210747569](../paper_figure/image-20210529210747569.png)
+  - The TLS connection is terminated inside the enclave
+  - The **encryption engine** inside the secret enclave is responsible for encrypting the sensitive fields of the requests passed from the TLS server endpoint.
+  - The encryption key used by the encryption engine is acquired by the Key Request Module (KRM) from a Key Distribution Center (KDC).
+    - via SGX remote attestation
+- Key distribution and management
+  - Each tenant is bound with a unique *encryption key* for the encryption/decryption of tenant's data stored outside the enclave.
+  - Every newly created secret enclave has to go through the remote attestation (RA) procedure to be attested and provisioned
+    - the encryption key can be stored securely and persistently in the local disk
+      - SGX sealing mechanism
+- Query processing
+  - only the sensitive fields of a message, such as the key/value field, need to be protected via encryption.
+    - the IV for encryption is computed from the SHA-256 hash of each sensitive field
+    - the IV and the MAC are appended to the ciphertext to be used at the time of decryption
+  - bind the key and value
+    - append the hash of the key to its corresponding value, and then encrypt the newly generated value
+      - to prevent an attacker from replacing the encrypted value.
+  - query with the encrypted key 
+    - forward to the request handler
+
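The query-processing notes above (an IV computed from the SHA-256 hash of each sensitive field, and the key hash appended to the value before encryption) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the names `derive_iv`, `bind_value`, and `unbind_value` are hypothetical, truncating the hash to 16 bytes is an assumption made to match the AES block size, and the AES encryption and MAC steps are deliberately left out.

```python
import hashlib
import hmac

IV_LEN = 16    # assumption: 16-byte IV, matching the AES block size
HASH_LEN = 32  # SHA-256 digest length in bytes


def derive_iv(field: bytes) -> bytes:
    """Derive a deterministic IV from the SHA-256 hash of a sensitive field."""
    return hashlib.sha256(field).digest()[:IV_LEN]


def bind_value(key: bytes, value: bytes) -> bytes:
    """Append SHA-256(key) to the value; the result is what would be encrypted."""
    return value + hashlib.sha256(key).digest()


def unbind_value(key: bytes, bound: bytes) -> bytes:
    """After decryption, strip the key hash and check it matches the queried key."""
    value, tag = bound[:-HASH_LEN], bound[-HASH_LEN:]
    if not hmac.compare_digest(tag, hashlib.sha256(key).digest()):
        raise ValueError("value is not bound to this key")
    return value
```

Because the IV depends only on the field contents, the same key field would presumably always encrypt to the same ciphertext, which is consistent with querying the cache by encrypted key. The binding check is what lets the enclave notice a value that was swapped in from a different key.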
+### Implementation and Evaluation
+- Implementation
+  - mbedtls-sgx: AES-128, SHA-256
+  - Tenant isolation
+    - per-tenant LRU for shared multi-tenant cache management strategy
+      - the same amount of data is bound to be evicted from each tenant
+    - bind each tenant with a logical database to enable the per-tenant LRU strategy
+  - switchless call to optimize the performance
+
+- Evaluation 
+  - four instances: redis + stunnel, EnclaveCache + switchless, EnclaveCache, Graphene-SGX + redis
+  - YCSB benchmark suite
+  - 1. throughput
+  - 2. hotspots analysis
+    - using Intel VTune amplifier
+  - 3. latency
+    - for requests with large values, performance drops sharply, mainly due to the increased computation overhead of cryptographic operations
+  - 4. scalability
+  - 5. cache fairness
+
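The per-tenant LRU strategy from the implementation notes above can be sketched as one LRU-ordered logical database per tenant. This is a simplified illustration under stated assumptions: the class name `PerTenantLRU` and the fixed per-tenant capacity are hypothetical, and the paper's actual eviction accounting inside Redis may differ.

```python
from collections import OrderedDict


class PerTenantLRU:
    """One LRU-ordered logical database per tenant (illustrative only)."""

    def __init__(self, capacity_per_tenant: int):
        # Assumption: every tenant gets the same fixed quota of entries.
        self.capacity = capacity_per_tenant
        self.tenants = {}  # tenant id -> OrderedDict of that tenant's entries

    def get(self, tenant, key):
        db = self.tenants.get(tenant)
        if db is None or key not in db:
            return None
        db.move_to_end(key)  # mark as most recently used
        return db[key]

    def put(self, tenant, key, value):
        db = self.tenants.setdefault(tenant, OrderedDict())
        db[key] = value
        db.move_to_end(key)
        if len(db) > self.capacity:
            db.popitem(last=False)  # evict this tenant's own LRU entry
```

Since eviction only ever touches the inserting tenant's own database, a tenant that floods the cache with spurious accesses cannot push a co-located tenant's data out of memory, which is the fairness property the threat model asks for.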
+
+
+
+## 2. Strength (Contributions of the paper)
+
+- leverage trusted hardware to solve the problem of **tenant isolation** and **data protection** in multi-tenant clouds.
+- adopts fine-grained, tenant-specific key-value encryption in SGX enclaves to `overcome the limit of SGX`.
+- Extensive evaluation
+  - better performance, higher scalability than running native, unmodified applications in the enclaves
+
+## 3. Weakness (Limitations of the paper)
+
+- Issues of encrypted data stored outside the enclaves
+  - malicious adversaries can delete or re-insert previous key-value pairs
+  - the operation types, key access frequencies and hashed-key distributions are also visible and exploitable.
+
+## 4. Some Insights (Future work)
+
+- Security issues in multi-tenant environments
+  - the multi-tenant environment may expose users' sensitive data to the other co-located, possibly malicious tenants
+  - the cloud platform provider itself cannot be considered trusted
+- SGX attack surface
+  - the attack surface with SGX enclaves is significantly reduced to only the `processor` and `the software inside enclaves`.
diff --git a/StoragePaperNote/Security/SGX-Technique/SwitchLess-SysTEX'18.md b/StoragePaperNote/Security/SGX-Technique/SwitchLess-SysTEX'18.md
@@ -0,0 +1,92 @@
+---
+typora-copy-images-to: ../paper_figure
+---
+Switchless Calls Made Practical in Intel SGX
+------------------------------------------
+|           Venue            |       Category       |
+| :------------------------: | :------------------: |
+| SysTEX'18 | SGX-Technique |
+[TOC]
+
+## 1. Summary
+### Motivation of this paper
+
+- Motivation
+  - One primary performance overhead is `enclave switches`, which are expensive and can be triggered frequently by cross-enclave function calls.
+    - the overhead of an ECall and OCall is over 8000 CPU cycles, which is > 50x more expensive than that of a system call.
+    - Previous works propose switchless calls, which avoid enclave switches by using **worker** threads to execute function calls **asynchronously**.
+  - The paper argues that the efficiency of this technique is questionable
+    - Is it always wise to trade extra CPU cores for reduced enclave switches?
+- Existing work
+  - Switchless calls: caller threads send ECall/OCall requests into **shared, untrusted buffers**, from which the requests are received and processed asynchronously by worker threads.
+    - thus requiring extra CPU cores
+  - questionable in the face of the diverse and dynamic workloads encountered by real-world applications
+
+### Switchless calls
+
+- Main idea:
+  - trade as few extra CPU cores as possible for as many reduced enclave switches as possible
+    - under different usage patterns and changing runtime workloads
+  - determine under what conditions switchless calls improve performance efficiently
+  - insights:
+    - ECall/OCalls should be executed as Switchless Calls if they are `short and called frequently`
+
+- Performance model
+
+  - assume the implementation of switchless calls adopts the `busy-wait` approach
+    - the caller thread must wait for the response in **a busy loop**
+    - pushing requests to a shared queue
+  - model
+    - the time spent inside the enclave $T_t$
+    - the time spent outside the enclave $T_u$
+    - the time to do an enclave switch $T_{es}$
+    - insights:
+      - busy-wait switchless OCalls outperform traditional OCalls if and only if $T_t + T_u \leq T_{es}$
+
+- Efficiency-based worker scheduling algorithm 
+
+  - strike a good balance between performance speedup and power conservation
+  - determine, at any point in time, the optimal number of workers so that the performance speedup of the callers is maximized while the wasted CPU cycles of the worker are minimized
+  - worker efficiency
+    - The CPU time saved by the worker / The CPU time consumed by the worker = $\frac{X \cdot T_{es}}{T}$
+    - has a **positive linear relationship** with the throughput speedup
+    - reflects the trade-off between the extra CPU cores and the reduced enclave switches
+  - Algorithm:
+    - maximizing the number of worker threads under the constraints of an upper bound on the number of worker threads and a lower bound on the average worker efficiency.
+      - self-adaptive: determine the optimal number of worker threads at any point in time
+      - user-configurable: make an explicit tradeoff between performance and energy conservation
+      - negligible overhead: require **collecting a few basic statistics** at runtime, thus incurring virtually no runtime overhead
+    - adjust the number of running worker threads by **sleeping or waking up threads**
+      - current average worker efficiency < the expected worker efficiency 
+        - sleep some threads
+      - current average worker efficiency > the expected worker efficiency
+        - wake up some threads
+
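The scheduling rule above (sleep workers when the average efficiency falls below the expected value, wake them when it rises above) can be sketched as follows. This is an illustrative reading of the notes rather than the SDK's code: the function names are hypothetical, $X$ is assumed to be the number of enclave switches a worker avoided in a measurement window, $T$ is assumed to be the CPU time the worker consumed in that window, and adjusting by one worker per step is an assumption.

```python
def worker_efficiency(calls_served: int, t_es: float, cpu_time: float) -> float:
    """X * T_es / T: CPU time saved by a worker over CPU time it consumed."""
    if cpu_time <= 0:
        return 0.0
    return calls_served * t_es / cpu_time


def adjust_workers(running: int, avg_efficiency: float,
                   target_efficiency: float, max_workers: int) -> int:
    """Return the new number of running workers after one scheduling step."""
    if avg_efficiency < target_efficiency:
        # Workers burn more cycles than they save: put one to sleep.
        return max(0, running - 1)
    if avg_efficiency > target_efficiency:
        # Workers pay off: wake one more, up to the configured upper bound.
        return min(max_workers, running + 1)
    return running
```

For instance, a worker that served 100 calls in a window of 1,000,000 cycles with $T_{es} = 8000$ cycles has efficiency 0.8, so with a target of 0.5 the scheduler in this sketch would try to wake another worker.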
+### Implementation and Evaluation
+- Implementation (in Intel SGX SDK)
+  - adopt `busy-wait` switchless approach 
+  - maintain a fixed size thread pool for workers 
+  - Easy to use: label in EDL file
+  - Customizable worker management
+    - support registering callback functions to handle worker events
+
+- Evaluation
+  - Static workloads
+    - empty ECalls/OCalls
+    - sgx_fwrite
+  - Dynamic workloads
+
+## 2. Strength (Contributions of the paper)
+
+- the first work to give an in-depth performance analysis of switchless calls
+- propose a self-adaptive worker scheduling algorithm to automatically determine the number of workers 
+  - strike a good balance between performance and energy conservation
+
+## 3. Weakness (Limitations of the paper)
+
+## 4. Some Insights (Future work)
+
+- SGX background
+  - SGX is considered a promising hardware-based isolation technology, especially for **protecting security-sensitive workloads** on public clouds.
+  - Due to the high cost of enclave switches, it is problematic for **system-intensive** workloads.
+
diff --git a/StoragePaperNote/TinyLFU-ToS'17.md b/StoragePaperNote/TinyLFU-ToS'17.md
@@ -0,0 +1,35 @@
+---
+typora-copy-images-to: ../paper_figure
+---
+TinyLFU: A Highly Efficient Cache Admission Policy
+------------------------------------------
+|           Venue            |       Category       |
+| :------------------------: | :------------------: |
+| ToS'17 | Cache |
+[TOC]
+
+## 1. Summary
+### Motivation of this paper
+- Perfect LFU (PLFU): is an **optimal** policy when the access distribution is **static**
+  - Limitations:
+    - the cost of maintaining a complete frequency histogram for all data items ever accessed is prohibitively high.  
+    - cannot adapt to dynamic changes in the distribution.	
+
+### Method Name
+
+### Implementation and Evaluation
+
+## 2. Strength (Contributions of the paper)
+
+## 3. Weakness (Limitations of the paper)
+
+## 4. Some Insights (Future work)
+- Locality
+  - characterize the access frequency of all possible data items through a **probability distribution**
+    - the probability distribution is highly skewed (a small number of objects are much more likely to be accessed than other objects).
+- Least frequently used (LFU)
+  - When the probability distribution of the data access pattern is **constant** over time, it is easy to show that LFU yields the highest cache hit ratio.
+- LRU vs. LFU
+  - LRU can be implemented much more efficiently than LFU.
+    - LRU can automatically adapt to temporal changes in the data access patterns and to bursts on the workloads.
+
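The LFU and LRU observations above can be illustrated with a small simulation. This is a toy sketch under stated assumptions, not TinyLFU itself: `plfu_hits` models Perfect LFU with an unbounded frequency histogram and only replaces a cached item when the newcomer is strictly more frequent, and both function names are hypothetical.

```python
from collections import Counter, OrderedDict


def lru_hits(trace, capacity):
    """Count cache hits for plain LRU."""
    cache = OrderedDict()
    hits = 0
    for item in trace:
        if item in cache:
            hits += 1
            cache.move_to_end(item)
        else:
            cache[item] = True
            if len(cache) > capacity:
                cache.popitem(last=False)
    return hits


def plfu_hits(trace, capacity):
    """Count cache hits for Perfect LFU with a full frequency histogram."""
    freq = Counter()  # one counter per distinct item ever seen: the costly part
    cache = set()
    hits = 0
    for item in trace:
        freq[item] += 1
        if item in cache:
            hits += 1
        elif len(cache) < capacity:
            cache.add(item)
        else:
            victim = min(cache, key=lambda x: freq[x])
            if freq[item] > freq[victim]:
                cache.remove(victim)
                cache.add(item)
    return hits
```

On the static, skewed trace `list("abcabdabeabf")` with a capacity of 2, the hot items `a` and `b` stay resident under Perfect LFU (6 hits), while LRU lets each one-off item evict a hot one and gets 0 hits. The `freq` counter also shows the limitation noted above: it keeps an entry for every item ever accessed and never forgets old popularity.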
diff --git a/paper_figure/image-20210529210747569.png b/paper_figure/image-20210529210747569.png