diff --git a/README.md b/README.md index bea343c..609f2f7 100644 --- a/README.md +++ b/README.md @@ -56,6 +56,7 @@ A reading list related to storage systems, including data deduplication, erasure 22. *The Dilemma between Deduplication and Locality: Can Both be Achieved?*---FAST'21 ([link](https://www.usenix.org/system/files/fast21-zou.pdf)) [summary](https://yzr95924.github.io/paper_summary/MFDedup-FAST'21.html) 23. *SLIMSTORE: A Cloud-based Deduplication System for Multi-version Backups*----ICDE'21 ([link](http://www.cs.utah.edu/~lifeifei/papers/slimstore-icde21.pdf)) 24. *Improving the Performance of Deduplication-Based Backup Systems via Container Utilization Based Hot Fingerprint Entry Distilling*----ACM TOS'21 ([link](https://dl.acm.org/doi/full/10.1145/3459626)) +25. *BURST: A Chunk-Based Data Deduplication System with Burst-Encoded Fingerprint Matching*----MSST'24 ([link](https://www.msstconference.org/MSST-history/2024/Papers/msst24-1.2.pdf)) ### Restore Performances @@ -158,6 +159,7 @@ A reading list related to storage systems, including data deduplication, erasure 22. *Palantir: Hierarchical Similarity Detection for Post-Deduplication Delta Compression*----ASPLOS'24 ([link](https://qiangsu97.github.io/files/asplos24spring-final6.pdf)) 23. *DedupSearch: Two-Phase Deduplication Aware Keyword Search*----FAST'22 ([link](https://www.usenix.org/system/files/fast22-elias.pdf)) [summary](https://yzr95924.github.io/paper_summary/DedupSearch-FAST'22.html) 24. *Physical vs. Logical Indexing with IDEA: Inverted Deduplication-Aware Index*----FAST'24 ([link](https://www.usenix.org/system/files/fast24-levi.pdf)) [summary](https://yzr95924.github.io/paper_summary/IDEA-FAST'24.html) +25. 
*Is Low Similarity Threshold A Bad Idea in Delta Compression?*----HotStorage'24 ([link](https://henryhxu.github.io/share/hongming-hotstorage24.pdf)) ### Memory && Block-Layer Deduplication @@ -361,6 +363,11 @@ A reading list related to storage systems, including data deduplication, erasure 1. *How the Great Firewall of China Detects and Blocks Fully Encrypted Traffic*----USENIX Security'23 ([link](https://people.cs.umass.edu/~amir/papers/UsenixSecurity23_Encrypted_Censorship.pdf)) ## General Storage + +### HDD, SMR + +1. *Revisiting HDD Rules of Thumb: 1/3 Is Not (Quite) the Average Seek Distance*----MSST'24 ([link](https://www.msstconference.org/MSST-history/2024/Papers/msst24-1.1.pdf)) + ### Distributed Storage System 1. *MapReduce: Simplified Data Processing on Large Clusters*----OSDI'04 ([link](https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-osdi04.pdf)) 2. *Cumulus: Filesystem Backup to the Cloud*----FAST'09 ([link](https://www.usenix.org/legacy/event/fast09/tech/full_papers/vrable/vrable.pdf)) [summary](https://yzr95924.github.io/paper_summary/Cumulus-FAST'09.html) @@ -381,14 +388,15 @@ A reading list related to storage systems, including data deduplication, erasure 1. *TinyLFU: A Highly Efficient Cache Admission Policy*----ACM TOS'17 ([link](https://arxiv.org/pdf/1512.00727.pdf)) 2. *Hyperbolic Caching: Flexible Caching for Web Applications*----USENIX ATC'17 ([link](https://www.cs.princeton.edu/~mfreed/docs/hyperbolic-atc17.pdf)) -3. *It’s Time to Revisit LRU vs. FIFO*----HotStorage'20 ([link](https://www.usenix.org/system/files/hotstorage20_paper_eytan.pdf)) [summary](https://yzr95924.github.io/paper_summary/Cache-HotStorage'20.html) [trace](http://iotta.snia.org/traces/key-value) -4. *The CacheLib Caching Engine: Design and Experiences at Scale*----OSDI'20 ([link](https://www.usenix.org/system/files/osdi20-berg.pdf)) -5. *Unifying the Data Center Caching Layer — Feasible? 
Profitable?*----HotStorage'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3465332.3470884)) -6. *Learning Cache Replacement with Cacheus*----FAST'21 ([link](https://www.usenix.org/system/files/fast21-rodriguez.pdf)) -7. *Kangaroo: Caching Billions of Tiny Objects on Flash*----SOSP'21 ([link](https://jasony.me/publications/sosp21-kangaroo.pdf)) -8. *Segcache: a Memory-efficient and Scalable In-memory Key-value Cache for Small Objects*----NSDI'21 ([link](https://jasony.me/publications/nsdi21-segcache.pdf)) -9. *FarReach: Write-back Caching in Programmable Switches*----USENIX ATC'23 ([link](http://www.cse.cuhk.edu.hk/~pclee/www/pubs/atc23.pdf)) -10. *FIFO can be Better than LRU: the Power of Lazy Promotion and Quick Demotion*----HotOS'23 ([link](https://www.pdl.cmu.edu/PDL-FTP/Storage/Yang-FIFO-HotOS23.pdf)) +3. *Flashield: a Hybrid Key-value Cache that Controls Flash Write Amplification*----USENIX NSDI'19 ([link]()) +4. *It’s Time to Revisit LRU vs. FIFO*----HotStorage'20 ([link](https://www.usenix.org/system/files/hotstorage20_paper_eytan.pdf)) [summary](https://yzr95924.github.io/paper_summary/Cache-HotStorage'20.html) [trace](http://iotta.snia.org/traces/key-value) +5. *The CacheLib Caching Engine: Design and Experiences at Scale*----OSDI'20 ([link](https://www.usenix.org/system/files/osdi20-berg.pdf)) +6. *Unifying the Data Center Caching Layer — Feasible? Profitable?*----HotStorage'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3465332.3470884)) +7. *Learning Cache Replacement with Cacheus*----FAST'21 ([link](https://www.usenix.org/system/files/fast21-rodriguez.pdf)) +8. *Kangaroo: Caching Billions of Tiny Objects on Flash*----SOSP'21 ([link](https://jasony.me/publications/sosp21-kangaroo.pdf)) +9. *Segcache: a Memory-efficient and Scalable In-memory Key-value Cache for Small Objects*----NSDI'21 ([link](https://jasony.me/publications/nsdi21-segcache.pdf)) +10. 
*FarReach: Write-back Caching in Programmable Switches*----USENIX ATC'23 ([link](http://www.cse.cuhk.edu.hk/~pclee/www/pubs/atc23.pdf))
+11. *FIFO can be Better than LRU: the Power of Lazy Promotion and Quick Demotion*----HotOS'23 ([link](https://www.pdl.cmu.edu/PDL-FTP/Storage/Yang-FIFO-HotOS23.pdf))

### Hash

@@ -402,18 +410,15 @@ A reading list related to storage systems, including data deduplication, erasure
1. *A Lock-Free, Cache-Efficient Multi-Core Synchronization Mechanism for Line-Rate Network Traffic Monitoring*----IPDPS'10 ([link](https://www.cse.cuhk.edu.hk/~pclee/www/pubs/ipdps10.pdf))
2. *Lock-Free Collaboration Support for Cloud Storage Services with Operation Inference and Transformation*----FAST'20 ([link](https://www.usenix.org/system/files/fast20-chen_jian.pdf))

-### SSD, NVMe
+### SSD, Flash

1. *Design Tradeoffs for SSD Performance*----USENIX ATC'08 ([link](https://www.usenix.org/legacy/events/usenix08/tech/full_papers/agrawal/agrawal.pdf))
1. *Design Tradeoffs for SSD Reliability*----FAST'19 ([link](https://www.usenix.org/system/files/fast19-kim-bryan.pdf))
1. *The Tail at Store: A Revelation from Millions of Hours of Disk and SSD Deployments*----FAST'16 ([link](https://www.usenix.org/system/files/conference/fast16/fast16-papers-hao.pdf))
1. *The Unwritten Contract of Solid State Drives*----EuroSys'17 ([link](https://dl.acm.org/doi/pdf/10.1145/3064176.3064187))
-1. *ZNS: Avoiding the Block Interface Tax for Flash-based SSDs*----USENIX ATC'21 ([link](https://www.usenix.org/system/files/atc21-bjorling.pdf)) [code](https://github.com/westerndigitalcorporation/zenfs)
-1. *ZNS+: Advanced Zoned Namespace Interface for Supporting In-Storage Zone Compaction*----OSDI'21 ([link](https://www.usenix.org/system/files/osdi21-han.pdf))
1. 
*The CASE of FEMU: Cheap, Accurate, Scalable and Extensible Flash Emulator*----FAST'18 ([link](https://www.usenix.org/system/files/conference/fast18/fast18-li.pdf)) [summary](https://yzr95924.github.io/paper_summary/FEMU-FAST'18.html) 1. *From blocks to rocks: a natural extension of zoned namespaces*----HotStorage'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3465332.3470870)) 1. *Don’t Be a Blockhead: Zoned Namespaces Make Work on Conventional SSDs Obsolete*----HotOS'21 ([link](https://sigops.org/s/conferences/hotos/2021/papers/hotos21-s07-stavrinos.pdf)) [summary](https://yzr95924.github.io/paper_summary/BlockHead-HotOS'21.html) -1. Zone Append: A New Way of Writing to Zoned Storage----Vault'20 ([link](https://www.usenix.org/system/files/vault20_slides_bjorling.pdf)) 1. *What Systems Researchers Need to Know about NAND Flash*----HotStorage'13 ([link](https://www.usenix.org/system/files/conference/hotstorage13/hotstorage13-desnoyers.pdf)) 1. *Caveat-Scriptor: Write Anywhere Shingled Disks*----HotStorage'15 ([link](https://www.usenix.org/system/files/conference/hotstorage15/hotstorage15-kadekodi.pdf)) 1. *Towards an Unwritten Contract of Intel Optane SSD*----HotStorage'19 ([link](https://www.usenix.org/system/files/hotstorage19-paper-wu-kan.pdf)) @@ -428,28 +433,21 @@ A reading list related to storage systems, including data deduplication, erasure 1. *NVMeVirt: A Versatile Software-defined Virtual NVMe Device*----FAST'23 ([link](https://www.usenix.org/system/files/fast23-kim.pdf)) 1. *Excessive SSD-Internal Parallelism Considered Harmful*----HotStorage'23 ([link](https://dl.acm.org/doi/pdf/10.1145/3599691.3603412)) 1. *Is Garbage Collection Overhead Gone? Case study of F2FS on ZNS SSDs*----HotStorage'23 ([link](https://dl.acm.org/doi/pdf/10.1145/3599691.3603409)) - -### File system - -1. *Scale and Concurrency of GIGA+: File System Directories with Millions of Files*----FAST''11 ([link](https://www.usenix.org/legacy/event/fast11/tech/full_papers/PatilNew.pdf)) -2. 
*Journaling of Journal Is (Almost) Free*----FAST'14 ([link](https://www.usenix.org/system/files/conference/fast14/fast14-paper_shen.pdf)) -3. *F2FS: A New File System for Flash Storage*----FAST'15 ([link](https://www.usenix.org/system/files/conference/fast15/fast15-paper-lee.pdf)) -4. *POSIX is Dead! Long Live... errr... What Exactly?*----HotStorage'15 ([link](https://www.fsl.cs.stonybrook.edu/docs/cosy-hotos/hotstorage17posux.pdf)) -5. *BetrFS: A Right-Optimized Write-Optimized File System*----FAST'15 ([link](https://www.usenix.org/system/files/conference/fast15/fast15-paper-jannen_william.pdf)) -6. *File Systems Fated for Senescence? Nonsense, Says Science!*----FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-conway.pdf)) -7. *To FUSE or Not to FUSE: Performance of User-Space File Systems*----FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-vangoor.pdf)) -8. *iJournaling: Fine-Grained Journaling for Improving the Latency of Fsync System Call*----USENIX ATC'17 ([link](https://www.usenix.org/system/files/conference/atc17/atc17-park.pdf)) -9. *The Full Path to Full-Path Indexing*----FAST'18 ([link](https://www.usenix.org/system/files/conference/fast18/fast18-zhan.pdf)) -10. *SplitFS: persistent-memory file system that reduces software overhead*----SOSP'19 ([link](https://www.cs.utexas.edu/~vijay/papers/sosp19-splitfs.pdf)) -11. *EROFS: A Compression-friendly Readonly File System for Resource-scarce Devices*----USENIX ATC'19 ([link](https://www.usenix.org/system/files/atc19-gao.pdf)) -12. *Performance and Resource Utilization of FUSE User-Space File Systems*----ACM TOS'19 ([link](https://dl.acm.org/doi/10.1145/3310148)) -13. *Filesystem Aging: It's more Usage than Fullness*----HotStorage'19 ([link](https://www.cs.unc.edu/~porter/pubs/hotstorage19-paper-conway.pdf)) -14. *How to Copy Files*----FAST'20 ([link](https://www.usenix.org/system/files/fast20-zhan.pdf)) -15. 
*XFUSE: An Infrastructure for Running Filesystem Services in User Space*----USENIX ATC'21 ([link](https://www.usenix.org/system/files/atc21-huai.pdf))
-16. *WineFS: a hugepage-aware file system for persistent memory that ages gracefully*----SOSP'21 ([link](https://www.cs.utexas.edu/~vijay/papers/winefs-sosp21.pdf))
-17. *LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism*----SOSP'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3477132.3483565))
-18. *BetrFS: A Compleat File System for Commodity SSDs*----EuroSys'22 ([link](https://dl.acm.org/doi/pdf/10.1145/3492321.3519571))
-19. *Survey of Distributed File System Design Choices*----ACM TOS'22 ([link](https://dl.acm.org/doi/pdf/10.1145/3465405))
+1. *ZapRAID: Toward High-Performance RAID for ZNS SSDs via Zone Append*----APSys'23 ([link](https://www.cse.cuhk.edu.hk/~pclee/www/pubs/apsys23.pdf))
+1. *BypassD: Enabling fast userspace access to shared SSDs*----ASPLOS'24 ([link](https://dl.acm.org/doi/pdf/10.1145/3617232.3624854))
+
+### Open-Channel SSD, ZNS, SMR
+
+1. *LightNVM: The Linux Open-Channel SSD Subsystem*----USENIX FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-bjorling.pdf))
+2. *ZoneAlloy: Elastic Data and Space Management for Hybrid SMR Drives*----HotStorage'19 ([link](https://www.usenix.org/system/files/hotstorage19-paper-wu-fenggang.pdf))
+3. *Zone Append: A New Way of Writing to Zoned Storage*----Vault'20 ([link](https://www.usenix.org/system/files/vault20_slides_bjorling.pdf))
+4. *ZNS: Avoiding the Block Interface Tax for Flash-based SSDs*----USENIX ATC'21 ([link](https://www.usenix.org/system/files/atc21-bjorling.pdf)) [code](https://github.com/westerndigitalcorporation/zenfs)
+5. *ZNS+: Advanced Zoned Namespace Interface for Supporting In-Storage Zone Compaction*----OSDI'21 ([link](https://www.usenix.org/system/files/osdi21-han.pdf))
+6. 
*RAIZN: Redundant Array of Independent Zoned Namespaces*----ASPLOS'23 ([link](https://dl.acm.org/doi/pdf/10.1145/3575693.3575746)) +7. *An Efficient Order-Preserving Recovery for F2FS with ZNS SSD*----HotStorage'23 ([link](https://www.hotstorage.org/2023/papers/hotstorage23-final108.pdf)) +8. *Is Garbage Collection Overhead Gone? Case study of F2FS on ZNS SSDs*----HotStorage'23 ([link](https://huaicheng.github.io/p/hotstorage23-zgc.pdf)) +9. *A Free-Space Adaptive Runtime Zone-Reset Algorithm for Enhanced ZNS Efficiency*----HotStorage'23 ([link](https://discos.sogang.ac.kr/file/2023/intl_conf/HotStorage_2023_S_Byeon.pdf)) +10. *Can ZNS SSDs be Better Storage Devices for Persistent Cache?*----HotStorage'24 ([link](https://dl.acm.org/doi/pdf/10.1145/3655038.3665946)) [summary](https://yzr95924.github.io/paper_summary/ZNS_SSD_Cache-HotStorage'24.html) ### Non-volatile Memory @@ -518,14 +516,71 @@ A reading list related to storage systems, including data deduplication, erasure 1. *GPFS: A Shared-Disk File System for Large Computing Clusters*----FAST'02 ([link](https://www.usenix.org/legacy/publications/library/proceedings/fast02/full_papers/schmuck/schmuck.pdf)) 2. *Efficient Object Storage Journaling in a Distributed Parallel File System*----FAST'10 ([link](https://www.usenix.org/legacy/events/fast10/tech/full_papers/oral.pdf)) -3. *Taking back control of HPC file systems with Robinhood Policy Engine*----arxiv'15 ([link](https://arxiv.org/abs/1505.01448)) -4. *Lustre Lockahead: Early Experience and Performance using Optimized Locking*----CUG'17 ([link](https://cug.org/proceedings/cug2017_proceedings/includes/files/pap141s2-file1.pdf)) -5. *LPCC: Hierarchical Persistent Client Caching for Lustre*----SC'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3295500.3356139)) [slides](https://sc19.supercomputing.org/proceedings/tech_paper/tech_paper_files/pap112s5.pdf) -6. 
*A Performance Study of Lustre File System Checker: Bottlenecks and Potentials*----MSST'19 ([link](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8890077&casa_token=uy7uU5C8DQ4AAAAA:9Sp-zG-QWKhgkn5QkmpxDTuHmGljhJJEoq_c9bzVSYb9gUD5eXk2orJYhnvLdQE0HY3RaIRG_9zDYA)) -7. *I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning*----ICPP'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3337821.3337902)) -8. *HadaFS: A File System Bridging the Local and Shared Burst Buffer for Exascale Supercomputers*----FAST'23 ([link](https://www.usenix.org/system/files/fast23-he.pdf)) -9. *Accelerating I/O performance of ZFS-based Lustre file system in HPC environment*----Journal of Supercomputing'23 ([link](https://link.springer.com/article/10.1007/s11227-022-04966-7)) -10. *MetaWBC: POSIX-compliant Metadata Write-back Caching for Distributed File Systems*----SC'22 ([link](https://dl.acm.org/doi/pdf/10.5555/3571885.3571959)) -11. *Xfast: Extreme File Attribute Stat Acceleration for Lustre*----SC'23 ([link](https://dl.acm.org/doi/10.1145/3581784.3607080)) [slides](http://lustrefs.cn/wp-content/uploads/2023/11/CLUG2023_12_Emoly_Liu_Qian_Yingjin_Xfast_Extreme_File_Attribute_Stat_Acceleration_for_Lustre.pdf) -12. *The I/O Trace Initiative: Building a Collaborative I/O Archive to Advance HPC*----SC-workshop'23 ([link](https://salkhordeh.de/publication/trace-pdsw/trace-pdsw.pdf)) -13. *Combining Buffered I/O and Direct I/O in Distributed File Systems*----FAST'24 ([link](https://www.usenix.org/system/files/fast24-qian.pdf)) [slides](https://www.usenix.org/system/files/fast24_slides-qian.pdf) [summary](https://yzr95924.github.io/paper_summary/Lustre_BIO_DIO-FAST'24.html) +3. *Tips and Tricks for Diagnosing Lustre Problems on Cray Systems*----CUG'11 ([link](https://cug.org/5-publications/proceedings_attendee_lists/CUG11CD/pages/1-program/final_program/Wednesday/12A-Spitz-Paper.pdf)) +4. 
*Lustre Resiliency: Understanding Lustre Message Loss and Tuning for Resiliency*----CUG'15 ([link](https://cug.org/proceedings/cug2015_proceedings/includes/files/pap101.pdf)) +5. *Taking back control of HPC file systems with Robinhood Policy Engine*----arxiv'15 ([link](https://arxiv.org/abs/1505.01448)) +6. *Lustre Lockahead: Early Experience and Performance using Optimized Locking*----CUG'17 ([link](https://cug.org/proceedings/cug2017_proceedings/includes/files/pap141s2-file1.pdf)) +7. *LPCC: Hierarchical Persistent Client Caching for Lustre*----SC'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3295500.3356139)) [slides](https://sc19.supercomputing.org/proceedings/tech_paper/tech_paper_files/pap112s5.pdf) +8. *A Performance Study of Lustre File System Checker: Bottlenecks and Potentials*----MSST'19 ([link](https://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=8890077&casa_token=uy7uU5C8DQ4AAAAA:9Sp-zG-QWKhgkn5QkmpxDTuHmGljhJJEoq_c9bzVSYb9gUD5eXk2orJYhnvLdQE0HY3RaIRG_9zDYA)) +9. *I/O Characterization and Performance Evaluation of BeeGFS for Deep Learning*----ICPP'19 ([link](https://dl.acm.org/doi/pdf/10.1145/3337821.3337902)) +10. *HadaFS: A File System Bridging the Local and Shared Burst Buffer for Exascale Supercomputers*----FAST'23 ([link](https://www.usenix.org/system/files/fast23-he.pdf)) +11. *Accelerating I/O performance of ZFS-based Lustre file system in HPC environment*----Journal of Supercomputing'23 ([link](https://link.springer.com/article/10.1007/s11227-022-04966-7)) +12. *MetaWBC: POSIX-compliant Metadata Write-back Caching for Distributed File Systems*----SC'22 ([link](https://dl.acm.org/doi/pdf/10.5555/3571885.3571959)) +13. *Xfast: Extreme File Attribute Stat Acceleration for Lustre*----SC'23 ([link](https://dl.acm.org/doi/10.1145/3581784.3607080)) [slides](http://lustrefs.cn/wp-content/uploads/2023/11/CLUG2023_12_Emoly_Liu_Qian_Yingjin_Xfast_Extreme_File_Attribute_Stat_Acceleration_for_Lustre.pdf) +14. 
*The I/O Trace Initiative: Building a Collaborative I/O Archive to Advance HPC*----SC-workshop'23 ([link](https://salkhordeh.de/publication/trace-pdsw/trace-pdsw.pdf)) +15. *Combining Buffered I/O and Direct I/O in Distributed File Systems*----FAST'24 ([link](https://www.usenix.org/system/files/fast24-qian.pdf)) [slides](https://www.usenix.org/system/files/fast24_slides-qian.pdf) [summary](https://yzr95924.github.io/paper_summary/Lustre_BIO_DIO-FAST'24.html) + +## File System + +### File Fragmentation + +1. *The Effects of Filesystem Fragmentation*----OLS'06 ([link](https://www.landley.net/kdocs/ols/2006/ols2006v1-pages-193-208.pdf)) +2. *Ext4 Block and Inode Allocator Improvements*----OLS'08 ([link](https://www.kernel.org/doc/ols/2008/ols2008v1-pages-263-274.pdf)) +3. *File Systems Fated for Senescence? Nonsense, Says Science!*----FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-conway.pdf)) +4. *Filesystem Aging: It's more Usage than Fullness*----HotStorage'19 ([link](https://www.cs.unc.edu/~porter/pubs/hotstorage19-paper-conway.pdf)) + +### File System Analysis + +1. *Understanding Configuration Dependencies of File Systems*----HotStorage'22 ([link](https://www.hotstorage.org/2022/camera-ready/hotstorage22-132/pdf/hotstorage22-132.pdf)) +2. *CONFD: Analyzing Configuration Dependencies of File Systems for Fun and Profit*----FAST'24 ([link](https://www.usenix.org/system/files/fast23-mahmud.pdf)) + +### Journaling + +1. *Journaling of Journal Is (Almost) Free*----FAST'14 ([link](https://www.usenix.org/system/files/conference/fast14/fast14-paper_shen.pdf)) +1. *iJournaling: Fine-Grained Journaling for Improving the Latency of Fsync System Call*----USENIX ATC'17 ([link](https://www.usenix.org/system/files/conference/atc17/atc17-park.pdf)) +1. *FastCommit: Resource-efficient, Performant and Cost-effective File System Journaling*----USENIX ATC'24 ([link](https://www.usenix.org/system/files/atc24-shirwadkar.pdf)) + +### Page Cache + +1. 
*StreamCache: Revisiting Page Cache for File Scanning on Fast Storage Devices*----USENIX ATC'24 ([link](https://www.usenix.org/system/files/atc24-li-zhiyue.pdf))
+
+### System Design
+
+1. *The Linear Tape File System*----MSST'10 ([link](https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=55becb668bc6cbf0c13b09caa92b849246c36882))
+2. *Scale and Concurrency of GIGA+: File System Directories with Millions of Files*----FAST'11 ([link](https://www.usenix.org/legacy/event/fast11/tech/full_papers/PatilNew.pdf))
+3. *F2FS: A New File System for Flash Storage*----FAST'15 ([link](https://www.usenix.org/system/files/conference/fast15/fast15-paper-lee.pdf))
+4. *POSIX is Dead! Long Live... errr... What Exactly?*----HotStorage'17 ([link](https://www.fsl.cs.stonybrook.edu/docs/cosy-hotos/hotstorage17posux.pdf))
+5. *BetrFS: A Right-Optimized Write-Optimized File System*----FAST'15 ([link](https://www.usenix.org/system/files/conference/fast15/fast15-paper-jannen_william.pdf))
+6. *The Full Path to Full-Path Indexing*----FAST'18 ([link](https://www.usenix.org/system/files/conference/fast18/fast18-zhan.pdf))
+7. *SplitFS: persistent-memory file system that reduces software overhead*----SOSP'19 ([link](https://www.cs.utexas.edu/~vijay/papers/sosp19-splitfs.pdf))
+8. *EROFS: A Compression-friendly Readonly File System for Resource-scarce Devices*----USENIX ATC'19 ([link](https://www.usenix.org/system/files/atc19-gao.pdf))
+9. *How to Copy Files*----FAST'20 ([link](https://www.usenix.org/system/files/fast20-zhan.pdf))
+10. *WineFS: a hugepage-aware file system for persistent memory that ages gracefully*----SOSP'21 ([link](https://www.cs.utexas.edu/~vijay/papers/winefs-sosp21.pdf))
+11. *LineFS: Efficient SmartNIC Offload of a Distributed File System with Pipeline Parallelism*----SOSP'21 ([link](https://dl.acm.org/doi/pdf/10.1145/3477132.3483565))
+12. 
*BetrFS: A Compleat File System for Commodity SSDs*----EuroSys'22 ([link](https://dl.acm.org/doi/pdf/10.1145/3492321.3519571)) + +### FUSE + +1. *To FUSE or Not to FUSE: Performance of User-Space File Systems*----FAST'17 ([link](https://www.usenix.org/system/files/conference/fast17/fast17-vangoor.pdf)) +2. *Performance and Resource Utilization of FUSE User-Space File Systems*----ACM TOS'19 ([link](https://dl.acm.org/doi/10.1145/3310148)) +3. *XFUSE: An Infrastructure for Running Filesystem Services in User Space*----USENIX ATC'21 ([link](https://www.usenix.org/system/files/atc21-huai.pdf)) + +### Survey + +1. *Survey of Distributed File System Design Choices*----ACM TOS'22 ([link](https://dl.acm.org/doi/pdf/10.1145/3465405)) + +## Storage + AI + +### LLM in Storage + +1. *Can Modern LLMs Tune and Configure LSM-based Key-Value Stores?*----HotStorage'24 ([link](https://asu-idi.github.io/publications/files/HS24_GPT_Project.pdf)) diff --git a/paper_figure/image-20240805014222417.png b/paper_figure/image-20240805014222417.png new file mode 100644 index 0000000..4eeb080 Binary files /dev/null and b/paper_figure/image-20240805014222417.png differ diff --git a/paper_figure/image-20240807011652739.png b/paper_figure/image-20240807011652739.png new file mode 100644 index 0000000..64b8a00 Binary files /dev/null and b/paper_figure/image-20240807011652739.png differ diff --git a/storage_paper_note/deduplication/post_dedup/IDEA-FAST'24.md b/storage_paper_note/deduplication/post_dedup/IDEA-FAST'24.md new file mode 100644 index 0000000..532673a --- /dev/null +++ b/storage_paper_note/deduplication/post_dedup/IDEA-FAST'24.md @@ -0,0 +1,195 @@ +--- +typora-copy-images-to: ../paper_figure +--- +# Physical vs. Logical Indexing with IDEA: Inverted Deduplication-Aware Index + +| Venue | Category | +| :------------------------: | :------------------: | +| FAST'24 | Deduplicated System Design, Post-Deduplication Management | +[TOC] + +## 1. 
Summary
+### Motivation of this paper
+
+- motivation
+  - indexing deduplicated data might result in extreme inefficiencies
+    - index size
+      - proportional to the logical data size, **regardless of its deduplication ratio**
+      - each term must point to all the files containing it, **even if the files' content is almost identical**
+    - index creation overhead
+      - random and redundant accesses to the physical chunks
+  - **term indexing** is not supported by any deduplicating storage system
+    - focus on **textual data**
+    - VMware vSphere and Commvault only support file indexing
+      - identifies individual files within a backup based on metadata
+    - Dell-EMC Data Protection Search
+      - support full content indexing
+      - warn: processing the full content of a large number of files can be **time consuming**
+      - recommend performing targeted indexing on **specific backups and file types**
+- challenge
+  - two separate trends
+    - the growing need to process **cold data** (e.g., old backups)
+      - e.g., full-system scans, keyword searches --> deduplication-aware search
+    - the growing application of deduplication on primary storage of hot and warm data
+      - e.g., perform single-term searches for files within a deduplicated personal workstation
+  - indexing software at the file-system level --> **unaware** of the underlying deduplication at the storage system
+    - index size
+      - increase --> increase the latency of lookups
+    - index time
+      - scan all files in the system --> random IOs, high read amplification
+    - split terms
+      - chunking process will likely split the incoming data into chunks (at **arbitrary positions**)
+      - splitting words between adjacent chunks
+
+### IDEA
+
+- ![image-20240321002025742](./../paper_figure/image-20240321002025742.png)
+
+- key idea
+  - map terms to the unique physical chunks they appear in
+    - instead of the logical documents (disproportionately high)
+  - replace term-to-file mapping with
+    - term-to-chunk map
+    - chunk-to-file map (file ID)
+  - only need to modify the chunking process in the deduplication system
+    - **white-space aware** --> enforce chunk boundaries only between words
+- white-space aligned chunking
+  - content-defined chunking
+    - **continue scanning** the following characters until a white-space character is encountered
+  - fixed-size chunking
+    - **backward scanning** this chunk until a white-space character is encountered
+    - resulting chunks are always smaller than the fixed size --> can be stored in a single block
+      - can trim the block in memory to chunk boundary
+  - non-textual content
+    - applies only to chunking of **textual content**
+    - identify textual content by the file extension of the incoming data
+      - .c, .h, and .htm
+    - add a Boolean field to the metadata of each chunk in the file recipe and container
+      - only process chunks marked as textual
+- term-to-chunk mapping
+  - number of documents in the index --> number of physical chunks
+    - might be higher than the number of logical files
+  - chunks are **read sequentially**, each chunk is processed only once
+    - processing chunks is easily parallelizable
+  - lookup
+    - return the fingerprints of the chunks this term appears in
+- chunk-to-file mapping
+  - two complementing maps
+    - chunk-to-file map
+      - chunk fingerprint --> file IDs
+    - file-to-path map
+      - file IDs --> file's full pathname
+    - created from the metadata in the file recipe
+- keyword/term lookup
+  - step-1: yield the fingerprints of all the relevant chunks
+  - step-2: a series of lookups in the chunk-to-file map
+    - retrieves the IDs of all files containing these chunks
+  - step-3: a lookup of each file ID in the file-to-path map
+    - returns the final list of file names
+- ranking results
+  - extend IDEA to support document ranking with the TF-IDF metric
+
+### Implementation and Evaluation
+
+- implementation
+  - LucenePlusPlus + Destor
+    - use Lucene term-to-doc map
+    - ![image-20240321204347685](./../paper_figure/image-20240321204347685.png)
+  - scan all file recipes from Destor
+    - create the list of files containing each chunk using a key-value store
+  - use an SSD for the data structures which are external to Lucene
+- experimental setup
+  - trace
+    - ![image-20240321210826877](./../paper_figure/image-20240321210826877.png)
+  - hardware
+    - maps of all index alternatives were stored on a separate HDD
+    - chunk-to-file and file-to-path maps of IDEA were stored on an SSD
+- evaluation
+  - baseline
+    - traditional deduplication-oblivious indexing (Naive)
+  - indexing time
+    - the reduction is proportional to the **deduplication ratio**
+    - recipe-processing time is negligible compared to the chunk-processing time
+    - indexing time of IDEA is shorter than that of Naive by 49% to 76%
+  - index size
+    - Naive must record more files for all the terms included in them
+    - IDEA's additional information is recorded per chunk, not per term
+  - lookup times
+    - faster than Naive by up to 82%
+    - smaller size of its term-to-doc map
+      - incurs shorter lookup latency
+  - IDEA overhead
+    - IDEA has no advantage when compared to deduplication-oblivious indexing
+      - additional layer of indirection incurs **non-negligible overheads**, which are masked where the deduplication ratio is sufficiently high
+
+## 2. Strength (Contributions of the paper)
+
+- first design of a deduplication-aware term index
+- implementation of IDEA on Lucene
+  - open-source single-node inverted index used by Elasticsearch
+- extensive evaluation
+
+## 3. Weakness (Limitations of the paper)
+
+- trace is not very large
+- files containing compressed text (.pdf, .docx)
+  - their textual content can only be processed after the file is opened by a suitable application or converted by a dedicated tool
+  - individual chunks cannot be processed during offline index creation
+
+## 4. Some Insights (Future work)
+
+- deduplication scenarios
+  - backup and archival systems
+    - log-structured manner: chunk --> containers
+    - content-defined chunking
+  - primary (non-backup) storage systems and appliances
+    - support direct access to individual chunks
+    - fixed-sized chunking
+      - align the deduplicated chunks with the storage interface
+- deduplication data management
+  - implicit sharing of content between files complicates the following: it transforms logically-sequential data accesses into random IOs in the underlying physical media
+    - GC
+    - load balancing between volumes
+    - caching
+    - charge-back
+- term indexing: **term-to-file** indexing (map)
+  - ![image-20240321001530743](./../paper_figure/image-20240321001530743.png)
+  - return the files containing **a keyword** or **term**
+  - search engines, data analytics
+    - searched data might be deduplicated
+  - e.g. Elasticsearch
+    - built on top of the single-node Apache Lucene
+    - based on a hierarchy of skip-lists
+    - other variations
+      - Amazon OpenSearch, IBM Watson
+  - keyword: any searchable strings (natural language words)
+  - query
+    - the list of files containing this keyword
+    - optional: byte offsets in which the term appears
+  - indexing creation
+    - collect the documents
+    - identify the terms within each document
+    - normalize the terms
+    - create the list of documents, and optionally offsets, containing each term
+  - result ranking
+    - using a **scoring formula** on each result
+      - TF-IDF
+    - ![image-20240319012231775](./../paper_figure/image-20240319012231775.png)
+- deduplication basic
+  - file recipe
+    - a list of chunks' fingerprints, their sizes
+    - restore: locate the chunk by searching in the fingerprint map or cache of its entries
+  - pack the **compressed data** into containers
+- standard storage functionality
+  - can be made more efficient by taking advantage of deduplicated state
diff --git a/storage_paper_note/general_storage/OC_ZNS/ZNS_SSD_Cache-HotStorage'24.md b/storage_paper_note/general_storage/OC_ZNS/ZNS_SSD_Cache-HotStorage'24.md
new file mode 100644
index 0000000..87b8a95
--- /dev/null
+++ b/storage_paper_note/general_storage/OC_ZNS/ZNS_SSD_Cache-HotStorage'24.md
@@ -0,0 +1,140 @@
+---
+typora-copy-images-to: ../paper_figure
+---
+# Can ZNS SSDs be Better Storage Devices for Persistent Cache?
+
+| Venue | Category |
+| :------------------------: | :------------------: |
+| HotStorage'24 | ZNS SSDs, Cache |
+[TOC]
+
+## 1. Summary
+### Motivation of this paper
+
+- motivation
+  - existing works mainly focus on cache data on block-based regular SSDs
+    - widely used as storage backends for **persistent cache**
+  - caching workloads are **write- and update-intensive** with high capacity utilization
+    - incurs a large amount of **device-level write amplification** (WA)
+      - with many random and small writes to SSDs
+      - internal garbage collection (GC)
+    - SSD lifespan and performance issues
+  - ZNS SSDs
+    - two advantages
+      - need much lower internal over-provisioning --> larger capacity
+        - a better overall cache hit ratio
+      - new interfaces --> potential to reduce WA
+- problem
+  - explore three possible schemes to adapt the existing persistent cache system on ZNS SSDs
+  - utilize **CacheLib** as a general cache framework
+
+### ZNS SSDs in Persistent Cache
+
+- three possible schemes
+  - ![image-20240805014222417](./../paper_figure/image-20240805014222417.png)
+  - **File-Cache**
+    - run CacheLib on a ZNS-compatible file system (F2FS)
+    - FS handles all low-level operation management
+  - **Zone-Cache**
+    - directly maps the cache on-disk management unit (i.e., region) to the fixed-size zone
+    - achieve true zero WA and be GC-free
+  - **Region-Cache**
+    - a simple middle layer to translate the zone interface to the region interface
+    - needs GC to clean the zones
+- File-Cache
+  - ZNS SSD can be formatted with a compatible file system
+    - zone allocation, zone cleaning with GC, and indexing handled by FS
+  - **fully transparent** to CacheLib
+    - treat ZNS SSD like a regular device
+  - bad
+    - feasible and convenient, but will **bring noticeably high overhead**
+- Zone-Cache
+  - most of the persistent cache designs
+    - group the newly inserted cache objects into a much larger management unit (fixed-size regions)
+    - reduce WA and improve IO efficiency --> **allocating and evicting large IO units**
+  - enlarge the region size to match the zone size
+    - one region per zone
+    - when a region is evicted, the zone can be directly reset without any data migration
+  - real zero WA
+    - GC-free
+      - no OP is needed for GC
+    - no extra indexing
+      - adding one entry of zone number to the region metadata for IOs
+  - bad
+    - need to match the region to a large zone size (1077 MiB in Western Digital ZNS SSD)
+    - evicting a large region --> causes many valid or hot cache objects to be evicted
+      - impact the hit ratio
+    - need a larger region buffer in memory to cache the newly inserted objects
+      - more DRAM space
+    - long allocation time in eviction and a long filling time in insertion
+      - reducing the parallelism effectiveness
+- Region-Cache
+  - add a simple middle layer to translate region to physical zone addresses
+  - data management
+    - region ID --> in-zone addresses
+    - bitmap indicates whether the region is valid in zone
+    - 1024 MiB zone --> 16 MiB region
+  - GC
+    - use a background thread to check the empty zone number and valid data size
+    - GC threshold and the zone selection threshold are configurable
+      - depends on the workloads
+  - opens the design space to further optimize the throughput and WA
+    - co-design between cache management and zone management
+
+### Implementation and Evaluation
+
+- evaluation
+  - setting
+    - flexibility, space efficiency, performance, and WA
+    - compared with CacheLib on regular SSDs (**Block-Cache**)
+  - ZNS SSDs
+    - Western Digital Ultrastar DC ZN540 with 904 zones; the zone size is 1077 MiB
+  - regular SSD
+    - 1TiB SN540 SSD
+  - overall 
comparison + - ![image-20240807011652739](./../paper_figure/image-20240807011652739.png) + - Zone-Cache has the largest cache size (no OP) --> highest cache hit ratio + + - different OP ratio + - tradeoff between throughput and hit ratio + - higher WA --> lower throughput + + - end-to-end evaluation with RocksDB + - throughput: Region-Cache is highest, Zone-Cache is lowest + - ZNS SSDs can give a larger cache size than regular SSDs + + +## 2. Strength (Contributions of the paper) + +- ZNS SSDs persistent cache can reduce the tail latency and lower WA compared with regular SSDs +- ZNS SSDs can be better storage devices for persistent cache +- Zone-Cache can perform better in the **hit ratio** +- Region-Cache can perform better in **throughput** + +## 3. Weakness (Limitations of the paper) + +## 4. Some Insights (Future work) + +- open-channel SSDs + - separate different data streams into different channels + - relieving WA and GC penalties +- zone-based storage + - sequential write and zone-based cleaning constraints + - avoid internal GC + - GC task can be managed by the applications + - write pointer + - shift to the start by ***zone reset*** + - jump to the end of the zone by ***zone finish*** +- CacheLib + - a pluggable caching engine developed by Meta + - log-structured cache + - flash space is partitioned into regions + - each region is used to package cache objects with different sizes + - **evict entire regions** rather than individual cache objects + - region size is configurable, e.g., 16MiB + - are designed to use either + - a raw regular block device + - one large file allocated in a file system (pre-allocated file)
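The Region-Cache idea (fixed-size regions packed into zones, a per-zone validity bitmap, and threshold-triggered GC that migrates still-valid regions) can be sketched as a toy model. This is not the paper's implementation: all names, thresholds, and sizes are illustrative, actual IO is omitted, and it assumes an empty (over-provisioned) zone is always available when the active zone fills up.

```python
SLOTS = 1024 // 16  # regions per zone: 1 GiB zone, 16 MiB regions (simplified)

class RegionCache:
    """Toy Region-Cache translation layer: region ID -> (zone, slot)."""

    def __init__(self, num_zones, gc_low_watermark=1):
        self.low = gc_low_watermark                  # GC when empty zones drop below this
        self.valid = [[False] * SLOTS for _ in range(num_zones)]  # per-zone bitmap
        self.loc = {}                                # region id -> (zone, slot)
        self.rid = {}                                # (zone, slot) -> region id
        self.free = list(range(1, num_zones))        # empty zones; zone 0 starts active
        self.active, self.wp = 0, 0                  # active zone and its write pointer
        self.migrated = 0                            # regions rewritten by GC (the WA)
        self.resets = 0                              # zone resets issued

    def insert(self, region_id):
        while self.wp == SLOTS:                      # active zone full: open an empty one
            self.active, self.wp = self.free.pop(0), 0
            if len(self.free) < self.low:
                self._gc()
        z, s = self.active, self.wp
        self.valid[z][s] = True
        self.loc[region_id], self.rid[(z, s)] = (z, s), region_id
        self.wp += 1

    def evict(self, region_id):
        z, s = self.loc.pop(region_id)
        self.valid[z][s] = False
        del self.rid[(z, s)]
        if z != self.active and not any(self.valid[z]):
            self.resets += 1                         # fully invalid zone: reclaim by reset
            self.free.append(z)

    def _gc(self):
        # Victim: the non-active, non-free zone with the least valid data.
        candidates = [z for z in range(len(self.valid))
                      if z != self.active and z not in self.free]
        victim = min(candidates, key=lambda z: sum(self.valid[z]))
        for s in range(SLOTS):
            if self.valid[victim][s]:                # still-valid regions must move:
                r = self.rid.pop((victim, s))        # this rewrite is the device-level WA
                self.valid[victim][s] = False
                self.migrated += 1
                self.insert(r)
        self.resets += 1
        self.free.append(victim)
```

Zone-Cache is the degenerate case `SLOTS == 1`: every eviction leaves the zone fully invalid, so it is always reclaimed by a bare reset and `migrated` stays zero, which is the paper's zero-WA argument; Region-Cache trades some migration WA for finer-grained (16 MiB) eviction.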