diff --git a/README.md b/README.md
index bea343c..609f2f7 100644
--- a/README.md
+++ b/README.md
@@ -56,6 +56,7 @@ A reading list related to storage systems, including data deduplication, erasure
22. *The Dilemma between Deduplication and Locality: Can Both be Achieved?*---FAST'21 ([link](https://www.usenix.org/system/files/fast21-zou.pdf)) [summary](https://yzr95924.github.io/paper_summary/MFDedup-FAST'21.html)
23. *SLIMSTORE: A Cloud-based Deduplication System for Multi-version Backups*----ICDE'21 ([link](http://www.cs.utah.edu/~lifeifei/papers/slimstore-icde21.pdf))
24. *Improving the Performance of Deduplication-Based Backup Systems via Container Utilization Based Hot Fingerprint Entry Distilling*----ACM TOS'21 ([link](https://dl.acm.org/doi/full/10.1145/3459626))
+25. *BURST: A Chunk-Based Data Deduplication System with Burst-Encoded Fingerprint Matching*----MSST'24 ([link](https://www.msstconference.org/MSST-history/2024/Papers/msst24-1.2.pdf))
### Restore Performances
@@ -158,6 +159,7 @@ A reading list related to storage systems, including data deduplication, erasure
22. *Palantir: Hierarchical Similarity Detection for Post-Deduplication Delta Compression*----ASPLOS'24 ([link](https://qiangsu97.github.io/files/asplos24spring-final6.pdf))
23. *DedupSearch: Two-Phase Deduplication Aware Keyword Search*----FAST'22 ([link](https://www.usenix.org/system/files/fast22-elias.pdf)) [summary](https://yzr95924.github.io/paper_summary/DedupSearch-FAST'22.html)
24. *Physical vs. Logical Indexing with IDEA: Inverted Deduplication-Aware Index*----FAST'24 ([link](https://www.usenix.org/system/files/fast24-levi.pdf)) [summary](https://yzr95924.github.io/paper_summary/IDEA-FAST'24.html)
+25. *Is Low Similarity Threshold A Bad Idea in Delta Compression?*----HotStorage'24 ([link](https://henryhxu.github.io/share/hongming-hotstorage24.pdf))
### Memory && Block-Layer Deduplication
@@ -361,6 +363,11 @@ A reading list related to storage systems, including data deduplication, erasure
1. *How the Great Firewall of China Detects and Blocks Fully Encrypted Traffic*----USENIX Security'23 ([link](https://people.cs.umass.edu/~amir/papers/UsenixSecurity23_Encrypted_Censorship.pdf))
## General Storage
+
+### HDD, SMR
+
+1. *Revisiting HDD Rules of Thumb: 1/3 Is Not (Quite) the Average Seek Distance*----MSST'24 ([link](https://www.msstconference.org/MSST-history/2024/Papers/msst24-1.1.pdf))
+
### Distributed Storage System
1. *MapReduce: Simplified Data Processing on Large Clusters*----OSDI'04 ([link](https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf))
2. *Cumulus: Filesystem Backup to the Cloud*----FAST'09 ([link](https://www.usenix.org/legacy/event/fast09/tech/full_papers/vrable/vrable.pdf)) [summary](https://yzr95924.github.io/paper_summary/Cumulus-FAST'09.html)
@@ -381,14 +388,15 @@ A reading list related to storage systems, including data deduplication, erasure
1. *TinyLFU: A Highly Efficient Cache Admission Policy*----ACM TOS'17 ([link](https://arxiv.org/pdf/1512.00727.pdf))
2. *Hyperbolic Caching: Flexible Caching for Web Applications*----USENIX ATC'17 ([link](https://www.cs.princeton.edu/~mfreed/docs/hyperbolic-atc17.pdf))
-3. *It’s Time to Revisit LRU vs. FIFO*----HotStorage'20 ([link](https://www.usenix.org/system/files/hotstorage20_paper_eytan.pdf)) [summary](https://yzr95924.github.io/paper_summary/Cache-HotStorage'20.html) [trace](http://iotta.snia.org/traces/key-value)
-4. *The CacheLib Caching Engine: Design and Experiences at Scale*----OSDI'20 ([link](https://www.usenix.org/system/files/osdi20-berg.pdf))
-5. *Unifying the Data Center Caching Layer — Feasible? Profitable?*----HotStorage'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3465332.3470884))
-6. *Learning Cache Replacement with Cacheus*----FAST'21 ([link](https://www.usenix.org/system/files/fast21-rodriguez.pdf))
-7. *Kangaroo: Caching Billions of Tiny Objects on Flash*----SOSP'21 ([link](https://jasony.me/publications/sosp21-kangaroo.pdf))
-8. *Segcache: a Memory-efficient and Scalable In-memory Key-value Cache for Small Objects*----NSDI'21 ([link](https://jasony.me/publications/nsdi21-segcache.pdf))
-9. *FarReach: Write-back Caching in Programmable Switches*----USENIX ATC'23 ([link](http://www.cse.cuhk.edu.hk/~pclee/www/pubs/atc23.pdf))
-10. *FIFO can be Better than LRU: the Power of Lazy Promotion and Quick Demotion*----HotOS'23 ([link](https://www.pdl.cmu.edu/PDL-FTP/Storage/Yang-FIFO-HotOS23.pdf))
+3. *Flashield: a Hybrid Key-value Cache that Controls Flash Write Amplification*----USENIX NSDI'19 ([link](https://www.usenix.org/system/files/nsdi19-eisenman.pdf))
+4. *It’s Time to Revisit LRU vs. FIFO*----HotStorage'20 ([link](https://www.usenix.org/system/files/hotstorage20_paper_eytan.pdf)) [summary](https://yzr95924.github.io/paper_summary/Cache-HotStorage'20.html) [trace](http://iotta.snia.org/traces/key-value)
+5. *The CacheLib Caching Engine: Design and Experiences at Scale*----OSDI'20 ([link](https://www.usenix.org/system/files/osdi20-berg.pdf))
+6. *Unifying the Data Center Caching Layer — Feasible? Profitable?*----HotStorage'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3465332.3470884))
+7. *Learning Cache Replacement with Cacheus*----FAST'21 ([link](https://www.usenix.org/system/files/fast21-rodriguez.pdf))
+8. *Kangaroo: Caching Billions of Tiny Objects on Flash*----SOSP'21 ([link](https://jasony.me/publications/sosp21-kangaroo.pdf))
+9. *Segcache: a Memory-efficient and Scalable In-memory Key-value Cache for Small Objects*----NSDI'21 ([link](https://jasony.me/publications/nsdi21-segcache.pdf))
+10. *FarReach: Write-back Caching in Programmable Switches*----USENIX ATC'23 ([link](http://www.cse.cuhk.edu.hk/~pclee/www/pubs/atc23.pdf))
+11. *FIFO can be Better than LRU: the Power of Lazy Promotion and Quick Demotion*----HotOS'23 ([link](https://www.pdl.cmu.edu/PDL-FTP/Storage/Yang-FIFO-HotOS23.pdf))
### Hash
@@ -402,18 +410,15 @@ A reading list related to storage systems, including data deduplication, erasure
1. *A Lock-Free, Cache-Efficient Multi-Core Synchronization Mechanism for Line-Rate Network Traffic Monitoring*----IPDPS'10 ([link](https://www.cse.cuhk.edu.hk/~pclee/www/pubs/ipdps10.pdf))
2. *Lock-Free Collaboration Support for Cloud Storage Services with Operation Inference and Transformation*----FAST'20 ([link](https://www.usenix.org/system/files/fast20-chen_jian.pdf))
-### SSD, NVMe
+### SSD, Flash
1. *Design Tradeoffs for SSD Performance*----USENIX ATC'08 ([link](https://www.usenix.org/legacy/events/usenix08/tech/full_papers/agrawal/agrawal.pdf))
1. *Design Tradeoffs for SSD Reliability*----USENIX ATC'19 ([link](https://www.usenix.org/system/files/fast19-kim-bryan.pdf))
1. *The Tail at Store: A Revelation from Millions of Hours of Disk and SSD Deployments*----FAST'16 ([link](https://www.usenix.org/system/files/conference/fast16/fast16-papers-hao.pdf))
1. *The Unwritten Contract of Solid State Drives*----EuroSys'17 ([link](https://dl.acm.org/doi/pdf/10.1145/3064176.3064187))
-1. *ZNS: Avoiding the Block Interface Tax for Flash-based SSDs*----USENIX ATC'21 ([link](https://www.usenix.org/system/files/atc21-bjorling.pdf)) [code](https://github.com/westerndigitalcorporation/zenfs)
-1. *ZNS+: Advanced Zoned Namespace Interface for Supporting In-Storage Zone Compaction*----OSDI'21 ([link](https://www.usenix.org/system/files/osdi21-han.pdf))
1. *The CASE of FEMU: Cheap, Accurate, Scalable and Extensible Flash Emulator*----FAST'18 ([link](https://www.usenix.org/system/files/conference/fast18/fast18-li.pdf)) [summary](https://yzr95924.github.io/paper_summary/FEMU-FAST'18.html)
1. *From blocks to rocks: a natural extension of zoned namespaces*----HotStorage'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3465332.3470870))
1. *Don’t Be a Blockhead: Zoned Namespaces Make Work on Conventional SSDs Obsolete*----HotOS'21 ([link](https://sigops.org/s/conferences/hotos/2021/papers/hotos21-s07-stavrinos.pdf)) [summary](https://yzr95924.github.io/paper_summary/BlockHead-HotOS'21.html)
-1. Zone Append: A New Way of Writing to Zoned Storage----Vault'20 ([link](https://www.usenix.org/system/files/vault20_slides_bjorling.pdf))
1. *What Systems Researchers Need to Know about NAND Flash*----HotStorage'13 ([link](https://www.usenix.org/system/files/conference/hotstorage13/hotstorage13-desnoyers.pdf))
1. *Caveat-Scriptor: Write Anywhere Shingled Disks*----HotStorage'15 ([link](https://www.usenix.org/system/files/conference/hotstorage15/hotstorage15-kadekodi.pdf))
1. *Towards an Unwritten Contract of Intel Optane SSD*----HotStorage'19 ([link](https://www.usenix.org/system/files/hotstorage19-paper-wu-kan.pdf))
@@ -428,28 +433,21 @@ A reading list related to storage systems, including data deduplication, erasure
1. *NVMeVirt: A Versatile Software-defined Virtual NVMe Device*----FAST'23 ([link](https://www.usenix.org/system/files/fast23-kim.pdf))
1. *Excessive SSD-Internal Parallelism Considered Harmful*----HotStorage'23 ([link](https://dl.acm.org/doi/pdf/10.1145/3599691.3603412))
1. *Is Garbage Collection Overhead Gone? Case study of F2FS on ZNS SSDs*----HotStorage'23 ([link](https://dl.acm.org/doi/pdf/10.1145/3599691.3603409))
-
-### File system
-
-1. *Scale and Concurrency of GIGA+: File System Directories with Millions of Files*----FAST''11 ([link](https://www.usenix.org/legacy/event/fast11/tech/full_papers/PatilNew.pdf))
-2. *Journaling of Journal Is (Almost) Free*----FAST'14 ([link](https://www.usenix.org/system/files/conference/fast14/fast14-paper_shen.pdf))
-3. *F2FS: A New File System for Flash Storage*----FAST'15 ([link](https://www.usenix.org/system/files/conference/fast15/fast15-paper-lee.pdf))
-4. *POSIX is Dead! Long Live... errr... What Exactly?*----HotStorage'15 ([link](https://www.fsl.cs.stonybrook.edu/docs/cosy-hotos/hotstorage17posux.pdf))
-5. *BetrFS: A Right-Optimized Write-Optimized File System*----FAST'15 ([link](https://www.usenix.org/system/files/conference/fast15/fast15-paper-jannen_william.pdf))
-6. *File Systems Fated for Senescence? Nonsense, Says Science!*----FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-conway.pdf))
-7. *To FUSE or Not to FUSE: Performance of User-Space File Systems*----FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-vangoor.pdf))
-8. *iJournaling: Fine-Grained Journaling for Improving the Latency of Fsync System Call*----USENIX ATC'17 ([link](https://www.usenix.org/system/files/conference/atc17/atc17-park.pdf))
-9. *The Full Path to Full-Path Indexing*----FAST'18 ([link](https://www.usenix.org/system/files/conference/fast18/fast18-zhan.pdf))
-10. *SplitFS: persistent-memory file system that reduces software overhead*----SOSP'19 ([link](https://www.cs.utexas.edu/~vijay/papers/sosp19-splitfs.pdf))
-11. *EROFS: A Compression-friendly Readonly File System for Resource-scarce Devices*----USENIX ATC'19 ([link](https://www.usenix.org/system/files/atc19-gao.pdf))
-12. *Performance and Resource Utilization of FUSE User-Space File Systems*----ACM TOS'19 ([link](https://dl.acm.org/doi/10.1145/3310148))
-13. *Filesystem Aging: It's more Usage than Fullness*----HotStorage'19 ([link](https://www.cs.unc.edu/~porter/pubs/hotstorage19-paper-conway.pdf))
-14. *How to Copy Files*----FAST'20 ([link](https://www.usenix.org/system/files/fast20-zhan.pdf))
-15. *XFUSE: An Infrastructure for Running Filesystem Services in User Space*----USENIX ATC'21 ([link](https://www.usenix.org/system/files/atc21-huai.pdf))
-16. *WineFS: a hugepage-aware file system for persistent memory that ages gracefully*----SOSP'21 ([link](https://www.cs.utexas.edu/~vijay/papers/winefs-sosp21.pdf))
-17. *LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism*----SOSP'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3477132.3483565))
-18. *BetrFS: A Compleat File System for Commodity SSDs*----EuroSys'22 ([link](https://dl.acm.org/doi/pdf/10.1145/3492321.3519571))
-19. *Survey of Distributed File System Design Choices*----ACM TOS'22 ([link](https://dl.acm.org/doi/pdf/10.1145/3465405))
+1. *ZapRAID: Toward High-Performance RAID for ZNS SSDs via Zone Append*----APSys'23 ([link](https://www.cse.cuhk.edu.hk/~pclee/www/pubs/apsys23.pdf))
+1. *BypassD: Enabling fast userspace access to shared SSDs*----ASPLOS'24 ([link](https://dl.acm.org/doi/pdf/10.1145/3617232.3624854))
+
+### Open-Channel SSD, ZNS, SMR
+
+1. *LightNVM: The Linux Open-Channel SSD Subsystem*----USENIX FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-bjorling.pdf))
+2. *ZoneAlloy: Elastic Data and Space Management for Hybrid SMR Drives*----HotStorage'19 ([link](https://www.usenix.org/system/files/hotstorage19-paper-wu-fenggang.pdf))
+3. *Zone Append: A New Way of Writing to Zoned Storage*----Vault'20 ([link](https://www.usenix.org/system/files/vault20_slides_bjorling.pdf))
+4. *ZNS: Avoiding the Block Interface Tax for Flash-based SSDs*----USENIX ATC'21 ([link](https://www.usenix.org/system/files/atc21-bjorling.pdf)) [code](https://github.com/westerndigitalcorporation/zenfs)
+5. *ZNS+: Advanced Zoned Namespace Interface for Supporting In-Storage Zone Compaction*----OSDI'21 ([link](https://www.usenix.org/system/files/osdi21-han.pdf))
+6. *RAIZN: Redundant Array of Independent Zoned Namespaces*----ASPLOS'23 ([link](https://dl.acm.org/doi/pdf/10.1145/3575693.3575746))
+7. *An Efficient Order-Preserving Recovery for F2FS with ZNS SSD*----HotStorage'23 ([link](https://www.hotstorage.org/2023/papers/hotstorage23-final108.pdf))
+8. *Is Garbage Collection Overhead Gone? Case study of F2FS on ZNS SSDs*----HotStorage'23 ([link](https://huaicheng.github.io/p/hotstorage23-zgc.pdf))
+9. *A Free-Space Adaptive Runtime Zone-Reset Algorithm for Enhanced ZNS Efficiency*----HotStorage'23 ([link](https://discos.sogang.ac.kr/file/2023/intl_conf/HotStorage_2023_S_Byeon.pdf))
+10. *Can ZNS SSDs be Better Storage Devices for Persistent Cache?*----HotStorage'24 ([link](https://dl.acm.org/doi/pdf/10.1145/3655038.3665946)) [summary](https://yzr95924.github.io/paper_summary/ZNS_SSD_Cache-HotStorage'24.html)
### Non-volatile Memory
@@ -518,14 +516,71 @@ A reading list related to storage systems, including data deduplication, erasure
1. *GPFS: A Shared-Disk File System for Large Computing Clusters*----FAST'02 ([link](https://www.usenix.org/legacy/publications/library/proceedings/fast02/full_papers/schmuck/schmuck.pdf))
2. *Efficient Object Storage Journaling in a Distributed Parallel File System*----FAST'10 ([link](https://www.usenix.org/legacy/events/fast10/tech/full_papers/oral.pdf))
-3. *Taking back control of HPC file systems with Robinhood Policy Engine*----arxiv'15 ([link](https://arxiv.org/abs/1505.01448))
-4. *Lustre Lockahead: Early Experience and Performance using Optimized Locking*----CUG'17 ([link](https://cug.org/proceedings/cug2017_proceedings/includes/files/pap141s2-file1.pdf))
-5. *LPCC: Hierarchical Persistent Client Caching for Lustre*----SC'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3295500.3356139)) [slides](https://sc19.supercomputing.org/proceedings/tech_paper/tech_paper_files/pap112s5.pdf)
-6. *A Performance Study of Lustre File System Checker: Bottlenecks and Potentials*----MSST'19 ([link](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8890077&casa_token=uy7uU5C8DQ4AAAAA:9Sp-zG-QWKhgkn5QkmpxDTuHmGljhJJEoq_c9bzVSYb9gUD5eXk2orJYhnvLdQE0HY3RaIRG_9zDYA))
-7. *I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning*----ICPP'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3337821.3337902))
-8. *HadaFS: A File System Bridging the Local and Shared Burst Buffer for Exascale Supercomputers*----FAST'23 ([link](https://www.usenix.org/system/files/fast23-he.pdf))
-9. *Accelerating I/O performance of ZFS-based Lustre file system in HPC environment*----Journal of Supercomputing'23 ([link](https://link.springer.com/article/10.1007/s11227-022-04966-7))
-10. *MetaWBC: POSIX-compliant Metadata Write-back Caching for Distributed File Systems*----SC'22 ([link](https://dl.acm.org/doi/pdf/10.5555/3571885.3571959))
-11. *Xfast: Extreme File Attribute Stat Acceleration for Lustre*----SC'23 ([link](https://dl.acm.org/doi/10.1145/3581784.3607080)) [slides](http://lustrefs.cn/wp-content/uploads/2023/11/CLUG2023_12_Emoly_Liu_Qian_Yingjin_Xfast_Extreme_File_Attribute_Stat_Acceleration_for_Lustre.pdf)
-12. *The I/O Trace Initiative: Building a Collaborative I/O Archive to Advance HPC*----SC-workshop'23 ([link](https://salkhordeh.de/publication/trace-pdsw/trace-pdsw.pdf))
-13. *Combining Buffered I/O and Direct I/O in Distributed File Systems*----FAST'24 ([link](https://www.usenix.org/system/files/fast24-qian.pdf)) [slides](https://www.usenix.org/system/files/fast24_slides-qian.pdf) [summary](https://yzr95924.github.io/paper_summary/Lustre_BIO_DIO-FAST'24.html)
+3. *Tips and Tricks for Diagnosing Lustre Problems on Cray Systems*----CUG'11 ([link](https://cug.org/5-publications/proceedings_attendee_lists/CUG11CD/pages/1-program/final_program/Wednesday/12A-Spitz-Paper.pdf))
+4. *Lustre Resiliency: Understanding Lustre Message Loss and Tuning for Resiliency*----CUG'15 ([link](https://cug.org/proceedings/cug2015_proceedings/includes/files/pap101.pdf))
+5. *Taking back control of HPC file systems with Robinhood Policy Engine*----arxiv'15 ([link](https://arxiv.org/abs/1505.01448))
+6. *Lustre Lockahead: Early Experience and Performance using Optimized Locking*----CUG'17 ([link](https://cug.org/proceedings/cug2017_proceedings/includes/files/pap141s2-file1.pdf))
+7. *LPCC: Hierarchical Persistent Client Caching for Lustre*----SC'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3295500.3356139)) [slides](https://sc19.supercomputing.org/proceedings/tech_paper/tech_paper_files/pap112s5.pdf)
+8. *A Performance Study of Lustre File System Checker: Bottlenecks and Potentials*----MSST'19 ([link](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8890077&casa_token=uy7uU5C8DQ4AAAAA:9Sp-zG-QWKhgkn5QkmpxDTuHmGljhJJEoq_c9bzVSYb9gUD5eXk2orJYhnvLdQE0HY3RaIRG_9zDYA))
+9. *I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning*----ICPP'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3337821.3337902))
+10. *HadaFS: A File System Bridging the Local and Shared Burst Buffer for Exascale Supercomputers*----FAST'23 ([link](https://www.usenix.org/system/files/fast23-he.pdf))
+11. *Accelerating I/O performance of ZFS-based Lustre file system in HPC environment*----Journal of Supercomputing'23 ([link](https://link.springer.com/article/10.1007/s11227-022-04966-7))
+12. *MetaWBC: POSIX-compliant Metadata Write-back Caching for Distributed File Systems*----SC'22 ([link](https://dl.acm.org/doi/pdf/10.5555/3571885.3571959))
+13. *Xfast: Extreme File Attribute Stat Acceleration for Lustre*----SC'23 ([link](https://dl.acm.org/doi/10.1145/3581784.3607080)) [slides](http://lustrefs.cn/wp-content/uploads/2023/11/CLUG2023_12_Emoly_Liu_Qian_Yingjin_Xfast_Extreme_File_Attribute_Stat_Acceleration_for_Lustre.pdf)
+14. *The I/O Trace Initiative: Building a Collaborative I/O Archive to Advance HPC*----SC-workshop'23 ([link](https://salkhordeh.de/publication/trace-pdsw/trace-pdsw.pdf))
+15. *Combining Buffered I/O and Direct I/O in Distributed File Systems*----FAST'24 ([link](https://www.usenix.org/system/files/fast24-qian.pdf)) [slides](https://www.usenix.org/system/files/fast24_slides-qian.pdf) [summary](https://yzr95924.github.io/paper_summary/Lustre_BIO_DIO-FAST'24.html)
+
+## File System
+
+### File Fragmentation
+
+1. *The Effects of Filesystem Fragmentation*----OLS'06 ([link](https://www.landley.net/kdocs/ols/2006/ols2006v1-pages-193-208.pdf))
+2. *Ext4 Block and Inode Allocator Improvements*----OLS'08 ([link](https://www.kernel.org/doc/ols/2008/ols2008v1-pages-263-274.pdf))
+3. *File Systems Fated for Senescence? Nonsense, Says Science!*----FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-conway.pdf))
+4. *Filesystem Aging: It's more Usage than Fullness*----HotStorage'19 ([link](https://www.cs.unc.edu/~porter/pubs/hotstorage19-paper-conway.pdf))
+
+### File System Analysis
+
+1. *Understanding Configuration Dependencies of File Systems*----HotStorage'22 ([link](https://www.hotstorage.org/2022/camera-ready/hotstorage22-132/pdf/hotstorage22-132.pdf))
+2. *CONFD: Analyzing Configuration Dependencies of File Systems for Fun and Profit*----FAST'23 ([link](https://www.usenix.org/system/files/fast23-mahmud.pdf))
+
+### Journaling
+
+1. *Journaling of Journal Is (Almost) Free*----FAST'14 ([link](https://www.usenix.org/system/files/conference/fast14/fast14-paper_shen.pdf))
+1. *iJournaling: Fine-Grained Journaling for Improving the Latency of Fsync System Call*----USENIX ATC'17 ([link](https://www.usenix.org/system/files/conference/atc17/atc17-park.pdf))
+1. *FastCommit: Resource-efficient, Performant and Cost-effective File System Journaling*----USENIX ATC'24 ([link](https://www.usenix.org/system/files/atc24-shirwadkar.pdf))
+
+### Page Cache
+
+1. *StreamCache: Revisiting Page Cache for File Scanning on Fast Storage Devices*----USENIX ATC'24 ([link](https://www.usenix.org/system/files/atc24-li-zhiyue.pdf))
+
+### System Design
+
+1. *The Linear Tape File System*----MSST'10 ([link](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=55becb668bc6cbf0c13b09caa92b849246c36882))
+2. *Scale and Concurrency of GIGA+: File System Directories with Millions of Files*----FAST'11 ([link](https://www.usenix.org/legacy/event/fast11/tech/full_papers/PatilNew.pdf))
+3. *F2FS: A New File System for Flash Storage*----FAST'15 ([link](https://www.usenix.org/system/files/conference/fast15/fast15-paper-lee.pdf))
+4. *POSIX is Dead! Long Live... errr... What Exactly?*----HotStorage'17 ([link](https://www.fsl.cs.stonybrook.edu/docs/cosy-hotos/hotstorage17posux.pdf))
+5. *BetrFS: A Right-Optimized Write-Optimized File System*----FAST'15 ([link](https://www.usenix.org/system/files/conference/fast15/fast15-paper-jannen_william.pdf))
+6. *The Full Path to Full-Path Indexing*----FAST'18 ([link](https://www.usenix.org/system/files/conference/fast18/fast18-zhan.pdf))
+7. *SplitFS: persistent-memory file system that reduces software overhead*----SOSP'19 ([link](https://www.cs.utexas.edu/~vijay/papers/sosp19-splitfs.pdf))
+8. *EROFS: A Compression-friendly Readonly File System for Resource-scarce Devices*----USENIX ATC'19 ([link](https://www.usenix.org/system/files/atc19-gao.pdf))
+9. *How to Copy Files*----FAST'20 ([link](https://www.usenix.org/system/files/fast20-zhan.pdf))
+10. *WineFS: a hugepage-aware file system for persistent memory that ages gracefully*----SOSP'21 ([link](https://www.cs.utexas.edu/~vijay/papers/winefs-sosp21.pdf))
+11. *LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism*----SOSP'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3477132.3483565))
+12. *BetrFS: A Compleat File System for Commodity SSDs*----EuroSys'22 ([link](https://dl.acm.org/doi/pdf/10.1145/3492321.3519571))
+
+### FUSE
+
+1. *To FUSE or Not to FUSE: Performance of User-Space File Systems*----FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-vangoor.pdf))
+2. *Performance and Resource Utilization of FUSE User-Space File Systems*----ACM TOS'19 ([link](https://dl.acm.org/doi/10.1145/3310148))
+3. *XFUSE: An Infrastructure for Running Filesystem Services in User Space*----USENIX ATC'21 ([link](https://www.usenix.org/system/files/atc21-huai.pdf))
+
+### Survey
+
+1. *Survey of Distributed File System Design Choices*----ACM TOS'22 ([link](https://dl.acm.org/doi/pdf/10.1145/3465405))
+
+## Storage + AI
+
+### LLM in Storage
+
+1. *Can Modern LLMs Tune and Configure LSM-based Key-Value Stores?*----HotStorage'24 ([link](https://asu-idi.github.io/publications/files/HS24_GPT_Project.pdf))
diff --git a/paper_figure/image-20240805014222417.png b/paper_figure/image-20240805014222417.png
new file mode 100644
index 0000000..4eeb080
Binary files /dev/null and b/paper_figure/image-20240805014222417.png differ
diff --git a/paper_figure/image-20240807011652739.png b/paper_figure/image-20240807011652739.png
new file mode 100644
index 0000000..64b8a00
Binary files /dev/null and b/paper_figure/image-20240807011652739.png differ
diff --git a/storage_paper_note/deduplication/post_dedup/IDEA-FAST'24.md b/storage_paper_note/deduplication/post_dedup/IDEA-FAST'24.md
new file mode 100644
index 0000000..532673a
--- /dev/null
+++ b/storage_paper_note/deduplication/post_dedup/IDEA-FAST'24.md
@@ -0,0 +1,195 @@
+---
+typora-copy-images-to: ../paper_figure
+---
+# Physical vs. Logical Indexing with IDEA: Inverted Deduplication-Aware Index
+
+| Venue | Category |
+| :------------------------: | :------------------: |
+| FAST'24 | Deduplicated System Design, Post-Deduplication Management |
+[TOC]
+
+## 1. Summary
+### Motivation of this paper
+
+- motivation
+ - indexing deduplicated data might result in extreme inefficiencies
+ - index size
+ - proportion to the logical data size, **regardless of its deduplication ratio**
+ - each term must point to all the files containing it, **even if the files' content is almost identical**
+ - index creation overhead
+ - random and redundant accesses to the physical chunks
+ - **term indexing** is not supported by any deduplicating storage system
+ - focus on **textual data**
+ - VMware vSphere and Commvault only support file indexing
+ - identifies individual files within a backup based on metadata
+ - Dell-EMC Data Protection Search
+ - support full content indexing
+ - warn: processing the full content of a large number of files can be **time consuming**
+ - recommend performing targeted indexing on **specific backups and file types**
+- challenge
+ - two separate trends
+ - the growing need to process **cold data** (e.g., old backups)
+ - e.g., full-system scans, keyword searches --> deduplication-aware search
+ - the growing application of deduplication on primary storage of hot and warm data
+ - e.g., perform single-term searches for files within deduplicated personal workstation
+ - indexing software on file-system level --> **unaware** of the underlying deduplication at the storage system
+ - index size
+ - increase --> increase the latency of lookups
+ - index time
+ - scan all files in the system --> random IOs, high read amplification
+ - split terms
+    - the chunking process will likely split the incoming data into chunks (at **arbitrary positions**)
+ - splitting words between adjacent chunks
+
+### IDEA
+
+- 
+
+- key idea
+ - map terms to the unique physical chunks they appear in
+ - instead of the logical documents (disproportionately high)
+ - replace term-to-file mapping with
+ - term-to-chunk map
+ - chunk-to-file map (file ID)
+ - only need to modify chunking process in deduplication system
+ - **white-space aware** --> enforce chunk boundaries only between words
+- white-space aligned chunking
+ - content-defined chunking
+ - **continue scanning** the following characters until a white-space character is encountered
+ - fixed-size chunking
+ - **backward scanning** this chunk until a white-space character is encountered
+ - resulting chunks are always smaller than the fixed size --> can be stored in a single block
+ - can trim the block in memory to chunk boundary
+ - non-textual content
+ - only to chunking of **textual content**
+ - identify textual content by the file extension of the incoming data
+ - .c, .h, and .htm
+ - add a Boolean field to the metadata of each chunk in the file recipe and container
+ - only process chunks marked as textual
+- term-to-chunk mapping
+ - number of documents in the index --> number of physical chunks
+ - might be higher than the number of logical files
+ - chunks are **read sequentially**, each chunk is processed only once
+ - processing chunks is easily parallelizable
+
+ - lookup
+    - return the fingerprints of the chunks this term appears in
+
+- chunk-to-file mapping
+ - two complementing maps
+ - chunk-to-file map
+ - chunk fingerprint --> file IDs
+ - file-to-path map
+ - file IDs --> file's full pathname
+ - created from the metadata in the file recipe
+
+- keyword/term lookup
+ - step-1: yield the fingerprints of all the relevant chunks
+ - step-2: a series of lookups in the chunk-to-file map
+ - retrieves the IDs of all files containing these chunks
+ - step-3: a lookup of each file ID in the file-to-path map
+ - returns the final list of file names
+- ranking results
+ - extend IDEA to support document ranking with the TF-IDF metric
+
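The white-space aligned variant of fixed-size chunking described above can be sketched as follows. This is a minimal illustration assuming a byte-oriented buffer; the function name and the 4 KiB default are illustrative, not taken from the paper's implementation:

```python
def whitespace_aligned_chunks(data: bytes, fixed_size: int = 4096) -> list:
    """Cut fixed-size chunks, then scan each boundary backward until a
    white-space character so that no word is split between adjacent
    chunks. Chunks never exceed fixed_size, so each still fits in a
    single block."""
    chunks, start = [], 0
    while start < len(data):
        end = min(start + fixed_size, len(data))
        if end < len(data):
            cut = end
            # backward scan: stop at the first white-space character
            while cut > start and not chr(data[cut - 1]).isspace():
                cut -= 1
            if cut > start:
                end = cut  # aligned boundary found
            # else: a single "word" longer than fixed_size; fall back
            # to the unaligned fixed-size boundary
        chunks.append(data[start:end])
        start = end
    return chunks
```

A word that would straddle a fixed boundary is pushed whole into the next chunk, so a term never needs to be reassembled from two adjacent chunks during indexing.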
+### Implementation and Evaluation
+
+- implementation
+ - LucenePlusPlus + Destor
+ - use Lucene term-to-doc map
+ - 
+  - scan all file recipes from Destor
+ - create the list of files containing each chunk using a key-value store
+ - use an SSD for the data structures which are external to Lucene
+- experimental setup
+ - trace
+ - 
+
+ - hardware
+ - maps of all index alternatives were stored on a separate HDD
+    - chunk-to-file and file-to-path maps of IDEA were stored on an SSD
+
+- evaluation
+ - baseline
+ - traditional deduplication-oblivious indexing (Naive)
+
+ - indexing time
+ - the reduction is proportional to the **deduplication ratio**
+ - recipe-processing time is negligible compared to the chunk-processing time
+
+ - indexing time of IDEA is shorter than that of Naive by 49% to 76%
+
+ - index size
+    - Naive must record more files for all the terms included in them
+ - IDEA additional information is recorded per chunk, not per term
+
+ - lookup times
+ - is faster than Naive by up to 82%
+ - smaller size of its term-to-doc map
+ - incur shorter lookup latency
+
+ - IDEA overhead
+ - IDEA has no advantage when compared to deduplication-oblivious indexing
+    - the additional layer of indirection incurs **non-negligible overheads**, which are masked when the deduplication ratio is sufficiently high
+
+
+## 2. Strength (Contributions of the paper)
+
+- first design of a deduplication-aware term index
+- implementation of IDEA on Lucene
+ - open-source single-node inverted index used by the Elasticsearch
+- extensive evaluation
+
+## 3. Weakness (Limitations of the paper)
+
+- trace is not very large
+- files containing compressed text (.pdf, .docx)
+ - their textual content can only be processed after the file is opened by a suitable application or converted by a dedicated tool
+ - individual chunks cannot be processed during offline index creation
+
+## 4. Some Insights (Future work)
+
+- deduplication scenarios
+ - backup and archival systems
+ - log-structured manner: chunk --> containers
+ - content-defined chunking
+ - primary (non-backup) storage system and appliances
+ - support direct access to individual chunks
+ - fixed-sized chunking
+ - align the deduplicated chunks with the storage interface
+- deduplication data management
+  - implicit sharing of content between files complicates the following (it transforms logically-sequential data accesses into random IOs in the underlying physical media):
+ - GC
+ - load balancing between volumes
+ - caching
+ - charge-back
+- term indexing: **term-to-file** indexing (map)
+ - 
+ - return the files containing **a keyword** or **term**
+ - search engines, data analytics
+ - searched data might be deduplicated
+ - e.g. Elasticsearch
+ - built on top of the single-node Apache Lucene
+ - based on a hierarchy of skip-lists
+ - other variations
+ - Amazon OpenSearch, IBM Watson
+ - keyword: any searchable strings (natural language words)
+ - query
+ - the list of files containing this keyword
+ - optional: byte offsets in which the term appears
+ - indexing creation
+ - collect the documents
+ - identify the terms within each document
+ - normalize the terms
+ - create the list of documents, and optionally offsets, containing each term
+ - result ranking
+ - using a **scoring formula** on each result
+ - TF-IDF
+ - 
+- deduplication basic
+ - file recipe
+ - a list of chunks' fingerprints, their sizes
+ - restore: locate the chunk by searching in the fingerprint map or cache of its entries
+ - pack the **compressed data** into containers
+- standard storage functionality
+ - can be made more efficient by taking advantage of deduplicated state
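
The term-to-file indexing and TF-IDF ranking sketched in the notes above can be made concrete in a few lines of Python. This is a simplified illustration, not DedupSearch's or Lucene's actual implementation: the document names, the whitespace tokenizer, and the scoring details are placeholders.

```python
import math
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of files containing it (term-to-file index)."""
    index = defaultdict(set)
    for name, text in docs.items():
        for term in text.lower().split():   # normalize: lowercase, split on whitespace
            index[term].add(name)
    return index

def tf_idf(term, doc_text, docs, index):
    """Score one document for one term with a basic TF-IDF formula."""
    words = doc_text.lower().split()
    tf = words.count(term) / len(words)          # term frequency in this document
    idf = math.log(len(docs) / len(index[term])) # rarer terms score higher
    return tf * idf

docs = {
    "a.txt": "dedup dedup index",
    "b.txt": "index search",
    "c.txt": "search ranking",
}
index = build_index(docs)
print(sorted(index["index"]))   # ['a.txt', 'b.txt'] -- files containing the keyword
```

Ranking a query then amounts to evaluating `tf_idf` for each file in the term's posting set and sorting by score.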
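
The file-recipe and restore path from the "deduplication basic" notes can be sketched as follows. Fixed-size chunking and an in-memory dict stand in for content-defined chunking, containers, and the on-disk fingerprint index, so this is only a minimal model of the mechanism.

```python
import hashlib

def chunk_fp(data: bytes) -> str:
    """Fingerprint a chunk (SHA-1 here, as commonly used in dedup systems)."""
    return hashlib.sha1(data).hexdigest()

def write_file(data: bytes, chunk_size: int, store: dict) -> list:
    """Deduplicate fixed-size chunks; return the file recipe: (fingerprint, size)."""
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        fp = chunk_fp(chunk)
        store.setdefault(fp, chunk)      # only unique chunks are stored
        recipe.append((fp, len(chunk)))
    return recipe

def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the file by looking each fingerprint up in the chunk store."""
    return b"".join(store[fp] for fp, _ in recipe)

store = {}
recipe = write_file(b"abcdabcdxyz", 4, store)
assert restore(recipe, store) == b"abcdabcdxyz"
print(len(store))   # 2 unique chunks stored for 3 logical chunks
```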
diff --git a/storage_paper_note/general_storage/OC_ZNS/ZNS_SSD_Cache-HotStorage'24.md b/storage_paper_note/general_storage/OC_ZNS/ZNS_SSD_Cache-HotStorage'24.md
new file mode 100644
index 0000000..87b8a95
--- /dev/null
+++ b/storage_paper_note/general_storage/OC_ZNS/ZNS_SSD_Cache-HotStorage'24.md
@@ -0,0 +1,140 @@
+---
+typora-copy-images-to: ../paper_figure
+---
+# Can ZNS SSDs be Better Storage Devices for Persistent Cache?
+
+| Venue | Category |
+| :------------------------: | :------------------: |
+| HotStorage'24 | ZNS SSDs, Cache |
+[TOC]
+
+## 1. Summary
+### Motivation of this paper
+
+- motivation
+ - existing works mainly focus on caching data on block-based regular SSDs
+ - widely used as storage backends for **persistent cache**
+ - caching workloads are **write- and update-intensive** with high capacity utilization
+ - incur a large amount of **device-level write amplification** (WA)
+ - with many random and small writes to SSDs
+ - internal garbage collection (GC)
+ - SSD lifespan and performance issues
+ - ZNS SSDs
+ - two advantages
+ - need much lower internal over-provisioning --> larger capacity
+ - a better overall cache hit ratio
+ - new interfaces --> potential to reduce WA
+- problem
+ - explore three possible schemes to adapt the existing persistent cache system on ZNS SSDs
+ - utilize **CacheLib** as a general cache framework
+
+### ZNS SSDs in Persistent Cache
+
+- three possible schemes
+ - 
+ - **File-Cache**
+ - run CacheLib on a ZNS-compatible file system (F2FS)
+ - the FS handles all low-level operations and management
+ - **Zone-Cache**
+ - directly maps the cache on-disk management unit (i.e., region) to the fixed-size zone
+ - achieve true zero WA and be GC-free
+ - **Region-Cache**
+ - a simple middle layer to translate the zone interface to the region interface
+ - needs GC to clean the zones
+- File-Cache
+ - ZNS SSD can be formatted with a compatible file system
+ - zone allocation, zone cleaning with GC, and indexing are handled by the FS
+ - **fully transparent** to CacheLib
+ - treat ZNS SSD like a regular device
+ - bad
+ - feasible and convenient, but **brings noticeably high overhead**
+- Zone-Cache
+ - most of the persistent cache designs
+ - group the newly inserted cache objects into a much larger management unit (fixed-size regions)
+ - reduce WA and improve IO efficiency --> **allocating and evicting large IO units**
+ - enlarge the region size to match the zone size
+ - one region per zone
+ - when a region is evicted, the zone can be directly reset without any data migration
+ - real zero WA
+ - GC-free
+ - no OP is needed for GC
+ - no extra indexing
+ - adding one entry of zone number to the region metadata for IOs
+ - bad
+ - need to match the region to a large zone size (1077MiB in Western Digital ZNS SSD)
+ - evicting a large region --> cause many valid or hot cache objects to be evicted
+ - impact the hit ratio
+ - need a larger region buffer in memory to cache the newly inserted objects
+ - more DRAM space
+ - long allocation time in eviction and a long filling time in insertion
+ - reducing the parallelism effectiveness
+- Region-Cache
+ - add a simple middle layer to translate region to physical zone addresses
+ - data management
+ - region ID --> in-zone addresses
+ - bitmap indicates whether the region is valid in zone
+ - 1024 MiB Zone --> 16MiB region
+ - GC
+ - use a background thread to check the empty zone number and valid data size
+ - GC threshold and the zone selection threshold are configurable
+ - depends on the workloads
+ - opens the design space to further optimize the throughput and WA
+ - co-design between cache management and zone management
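
The Region-Cache translation layer described above can be modeled roughly as below. This is a hypothetical in-memory sketch, not the paper's code: the class name and GC policy are illustrative, while the sizes follow the paper's example (1024 MiB zones, 16 MiB regions, so 64 regions per zone).

```python
ZONE_MIB, REGION_MIB = 1024, 16
REGIONS_PER_ZONE = ZONE_MIB // REGION_MIB   # 64 regions fit in one zone

class RegionMap:
    def __init__(self, num_zones):
        # one bitmap per zone: True = the region slot holds valid data
        self.valid = [[False] * REGIONS_PER_ZONE for _ in range(num_zones)]

    def locate(self, region_id):
        """Translate a region ID to its in-zone address: (zone, slot)."""
        return region_id // REGIONS_PER_ZONE, region_id % REGIONS_PER_ZONE

    def write(self, region_id):
        z, s = self.locate(region_id)
        self.valid[z][s] = True

    def invalidate(self, region_id):
        z, s = self.locate(region_id)
        self.valid[z][s] = False

    def gc_candidates(self, threshold):
        """Zones whose valid-data fraction is below the (configurable) threshold."""
        return [z for z, bm in enumerate(self.valid)
                if sum(bm) / REGIONS_PER_ZONE < threshold]

rm = RegionMap(num_zones=2)
rm.write(0)
rm.write(65)            # region 65 lands in zone 1, slot 1
print(rm.locate(65))    # (1, 1)
```

A background GC thread would pick zones from `gc_candidates`, migrate their remaining valid regions, and reset the zones; tuning the threshold per workload is exactly the design space the paper points to.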
+
+### Implementation and Evaluation
+
+- evaluation
+ - setting
+ - flexibility, space efficiency, performance, and WA
+ - compared with CacheLib on regular SSDs (**Block-Cache**)
+
+ - ZNS SSDs
+ - Western Digital Ultrastar DC ZN540 with 904 zones; the zone size is 1077 MiB
+
+ - regular SSD
+ - 1TiB SN540 SSD
+
+ - overall comparison
+ - 
+ - Zone-Cache has the largest cache size (no OP) --> highest cache hit ratio
+
+ - different OP ratio
+ - tradeoff between throughput and hit ratio
+ - higher WA --> lower throughput
+
+ - end-to-end evaluation with RocksDB
+ - throughput: Region-Cache is highest, Zone-Cache is lowest
+ - ZNS SSDs can give a larger cache size than regular SSDs
+
+
+## 2. Strength (Contributions of the paper)
+
+- ZNS SSDs persistent cache can reduce the tail latency and lower WA compared with regular SSDs
+- ZNS SSDs can be better storage devices for persistent cache
+- Zone-Cache can perform better in the **hit ratio**
+- Region-Cache can perform better in **throughput**
+
+## 3. Weakness (Limitations of the paper)
+
+## 4. Some Insights (Future work)
+
+- open-channel SSDs
+ - separate different data streams into different channels
+ - relieving WA and GC penalties
+- zone-based storage
+ - sequential write and zone-based cleaning constraints
+ - avoid internal GC
+ - GC task can be managed by the applications
+ - write pointer
+ - shift to the start by ***zone reset***
+ - jump to the end of the zone by ***zone finish***
+- CacheLib
+ - a pluggable caching engine developed by Meta
+ - log-structured cache
+ - flash space is partitioned into regions
+ - each region packs cache objects of different sizes
+ - **evict entire regions** rather than individual cache objects
+ - region size is configurable, e.g., 16MiB
+ - designed to use either
+ - a raw regular block device
+ - one large file allocated in a file system (pre-allocated file)
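
The zone write-pointer behavior noted above (sequential-only appends, ***zone reset***, ***zone finish***) can be captured with a tiny model. This is an illustrative sketch of the interface semantics, not a real ZNS driver; the 16 MiB zone size is just an example value.

```python
class Zone:
    def __init__(self, size):
        self.size, self.wp = size, 0     # wp: write pointer, starts at zone start

    def append(self, nbytes):
        """Writes are sequential only: they land at the write pointer."""
        if self.wp + nbytes > self.size:
            raise ValueError("write past zone capacity")
        self.wp += nbytes
        return self.wp

    def reset(self):
        self.wp = 0                      # zone reset: pointer shifts back to the start

    def finish(self):
        self.wp = self.size              # zone finish: pointer jumps to the zone end

zone = Zone(size=16 * 2**20)             # a 16 MiB zone for illustration
zone.append(4096)
assert zone.wp == 4096
zone.finish()                            # zone is now full/closed
assert zone.wp == zone.size
zone.reset()                             # zone can be rewritten from the start
print(zone.wp)                           # 0
```

Because a reset discards the whole zone at once, a cache that maps one eviction unit to one zone (as Zone-Cache does) never needs to migrate data before reclaiming space.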