|
8 | 8 | // option. This file may not be copied, modified, or distributed
|
9 | 9 | // except according to those terms.
|
10 | 10 |
|
11 |
| -/*! |
12 |
| - * Collection types. |
13 |
| - */ |
| 11 | +//! Collection types. |
| 12 | +//! |
| 13 | +//! Rust's standard collection library provides efficient implementations of the most common |
| 14 | +//! general purpose programming data structures. By using the standard implementations, |
| 15 | +//! it should be possible for two libraries to communicate without significant data conversion. |
| 16 | +//! |
| 17 | +//! To get this out of the way: you should probably just use `Vec` or `HashMap`. These two |
| 18 | +//! collections cover most use cases for generic data storage and processing. They are |
| 19 | +//! exceptionally good at doing what they do. All the other collections in the standard |
| 20 | +//! library have specific use cases where they are the optimal choice, but these cases are |
| 21 | +//! borderline *niche* in comparison. Even when `Vec` and `HashMap` are technically suboptimal, |
| 22 | +//! they're probably a good enough choice to get started. |
| 23 | +//! |
| 24 | +//! Rust's collections can be grouped into four major categories: |
| 25 | +//! |
| 26 | +//! * Sequences: `Vec`, `RingBuf`, `DList`, `BitV` |
| 27 | +//! * Maps: `HashMap`, `BTreeMap`, `TreeMap`, `TrieMap`, `SmallIntMap`, `LruCache` |
| 28 | +//! * Sets: `HashSet`, `BTreeSet`, `TreeSet`, `TrieSet`, `BitVSet`, `EnumSet` |
| 29 | +//! * Misc: `PriorityQueue` |
| 30 | +//! |
| 31 | +//! # When Should You Use Which Collection? |
| 32 | +//! |
| 33 | +//! These are fairly high-level and quick break-downs of when each collection should be |
| 34 | +//! considered. Detailed discussions of strengths and weaknesses of individual collections |
| 35 | +//! can be found on their own documentation pages. |
| 36 | +//! |
| 37 | +//! ### Use a `Vec` when: |
| 38 | +//! * You want to collect items up to be processed or sent elsewhere later, and don't care about |
| 39 | +//! any properties of the actual values being stored. |
| 40 | +//! * You want a sequence of elements in a particular order, and will only be appending to |
| 41 | +//! (or near) the end. |
| 42 | +//! * You want a stack. |
| 43 | +//! * You want a resizable array. |
| 44 | +//! * You want a heap-allocated array. |
| 45 | +//! |
| 46 | +//! ### Use a `RingBuf` when: |
| 47 | +//! * You want a `Vec` that supports efficient insertion at both ends of the sequence. |
| 48 | +//! * You want a queue. |
| 49 | +//! * You want a double-ended queue (deque). |
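As a rough illustration of the queue and deque use cases above, here is a minimal sketch. It assumes a current stable toolchain, where the ring buffer type is named `VecDeque` rather than `RingBuf`; the helper `drain_fifo` is a hypothetical name introduced only for this example.

```rust
use std::collections::VecDeque;

/// Pushes every item onto the back of a queue, then drains the queue from
/// the front, returning the items in first-in, first-out order.
/// (`drain_fifo` is a hypothetical helper, not part of the standard library.)
fn drain_fifo(items: &[u32]) -> Vec<u32> {
    let mut queue = VecDeque::new();
    for &item in items {
        queue.push_back(item);
    }

    let mut out = Vec::with_capacity(queue.len());
    while let Some(front) = queue.pop_front() {
        out.push(front);
    }
    out
}

fn main() {
    // Used as a deque: both ends support push and pop.
    let mut deque = VecDeque::new();
    deque.push_back(2);
    deque.push_back(3);
    deque.push_front(1);
    assert_eq!(deque.pop_back(), Some(3));
    assert_eq!(deque.pop_front(), Some(1));

    // Used as a plain queue.
    println!("{:?}", drain_fifo(&[10, 20, 30]));
}
```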
| 50 | +//! |
| 51 | +//! ### Use a `DList` when: |
| 52 | +//! * You want a `Vec` or `RingBuf` of unknown size, and can't tolerate inconsistent |
| 53 | +//! performance during insertions. |
| 54 | +//! * You are *absolutely* certain you *really*, *truly*, want a doubly linked list. |
| 55 | +//! |
| 56 | +//! ### Use a `HashMap` when: |
| 57 | +//! * You want to associate arbitrary keys with an arbitrary value. |
| 58 | +//! * You want a cache. |
| 59 | +//! * You want a map, with no extra functionality. |
| 60 | +//! |
| 61 | +//! ### Use a `BTreeMap` when: |
| 62 | +//! * You're interested in what the smallest or largest key-value pair is. |
| 63 | +//! * You want to find the largest or smallest key that is smaller or larger than something. |
| 64 | +//! * You want to be able to get all of the entries in order on-demand. |
| 65 | +//! * You want a sorted map. |
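The ordered queries listed above might look like the following sketch. It assumes a current stable toolchain, where `BTreeMap` offers a `range` method; `largest_key_below`, `smallest_key_above`, and `sample_map` are hypothetical helpers written for this illustration.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

/// Returns the largest key strictly smaller than `bound`, if any.
fn largest_key_below(map: &BTreeMap<u32, &'static str>, bound: u32) -> Option<u32> {
    map.range(..bound).next_back().map(|(&k, _)| k)
}

/// Returns the smallest key strictly larger than `bound`, if any.
fn smallest_key_above(map: &BTreeMap<u32, &'static str>, bound: u32) -> Option<u32> {
    map.range((Excluded(bound), Unbounded)).next().map(|(&k, _)| k)
}

/// Builds a small map, inserting the keys out of order on purpose.
fn sample_map() -> BTreeMap<u32, &'static str> {
    let mut map = BTreeMap::new();
    map.insert(30, "thirty");
    map.insert(10, "ten");
    map.insert(20, "twenty");
    map
}

fn main() {
    let map = sample_map();

    // The smallest and largest entries sit at the two ends of the iterator.
    assert_eq!(map.iter().next(), Some((&10, &"ten")));
    assert_eq!(map.iter().next_back(), Some((&30, &"thirty")));

    // Iteration yields the entries in sorted key order.
    let keys: Vec<u32> = map.keys().copied().collect();
    assert_eq!(keys, vec![10, 20, 30]);

    println!("{:?}", largest_key_below(&map, 25));
    println!("{:?}", smallest_key_above(&map, 10));
}
```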
| 66 | +//! |
| 67 | +//! ### Use a `TreeMap` when: |
| 68 | +//! * You want a `BTreeMap`, but can't tolerate inconsistent performance. |
| 69 | +//! * You want a `BTreeMap`, but have *very large* keys or values. |
| 70 | +//! * You want a `BTreeMap`, but have keys that are expensive to compare. |
| 71 | +//! * You want a `BTreeMap`, but you accept arbitrary untrusted inputs. |
| 72 | +//! |
| 73 | +//! ### Use a `TrieMap` when: |
| 74 | +//! * You want a `HashMap`, but with many potentially large `uint` keys. |
| 75 | +//! * You want a `BTreeMap`, but with potentially large `uint` keys. |
| 76 | +//! |
| 77 | +//! ### Use a `SmallIntMap` when: |
| 78 | +//! * You want a `HashMap`, but with `uint` keys that are known to be small. |
| 79 | +//! * You want a `BTreeMap`, but with `uint` keys that are known to be small. |
| 80 | +//! |
| 81 | +//! ### Use the `Set` variant of any of these `Map`s when: |
| 82 | +//! * You just want to remember which keys you've seen. |
| 83 | +//! * There is no meaningful value to associate with your keys. |
| 84 | +//! * You just want a set. |
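The "remember which keys you've seen" case can be sketched as below, using `HashSet` on a current stable toolchain. The function `unique_in_order` is a hypothetical helper made up for this example.

```rust
use std::collections::HashSet;

/// Returns the items of `input` with duplicates removed, keeping the first
/// occurrence of each item in its original position.
fn unique_in_order(input: &[&str]) -> Vec<String> {
    let mut seen = HashSet::new();
    let mut out = Vec::new();
    for &word in input {
        // `insert` returns `true` only if the value was not already present.
        if seen.insert(word) {
            out.push(word.to_string());
        }
    }
    out
}

fn main() {
    let words = ["sea", "shells", "sea", "shore", "shells"];
    println!("{:?}", unique_in_order(&words));
}
```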
| 85 | +//! |
| 86 | +//! ### Use a `BitV` when: |
| 87 | +//! * You want to store an unbounded number of booleans in a small space. |
| 88 | +//! * You want a bitvector. |
| 89 | +//! |
| 90 | +//! ### Use a `BitVSet` when: |
| 91 | +//! * You want a `SmallIntSet`. |
| 92 | +//! |
| 93 | +//! ### Use an `EnumSet` when: |
| 94 | +//! * You want a set of C-like enum values, stored in a single `uint`. |
| 95 | +//! |
| 96 | +//! ### Use a `PriorityQueue` when: |
| 97 | +//! * You want to store a bunch of elements, but only ever want to process the "biggest" |
| 98 | +//! or "most important" one at any given time. |
| 99 | +//! * You want a priority queue. |
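A minimal sketch of the "only process the biggest one" pattern follows. It assumes a current stable toolchain, where the priority queue type is named `BinaryHeap` rather than `PriorityQueue`; `largest_n` is a hypothetical helper.

```rust
use std::collections::BinaryHeap;

/// Returns up to `n` of the largest values in `items`, biggest first.
fn largest_n(items: &[u32], n: usize) -> Vec<u32> {
    // Collecting into a `BinaryHeap` builds a max-heap over the values.
    let mut heap: BinaryHeap<u32> = items.iter().copied().collect();

    let mut out = Vec::with_capacity(n.min(items.len()));
    for _ in 0..n {
        // `pop` always removes the greatest remaining element.
        match heap.pop() {
            Some(top) => out.push(top),
            None => break,
        }
    }
    out
}

fn main() {
    println!("{:?}", largest_n(&[3, 1, 4, 1, 5, 9, 2, 6], 3));
}
```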
| 100 | +//! |
| 101 | +//! ### Use an `LruCache` when: |
| 102 | +//! * You want a cache that discards infrequently used items when it becomes full. |
| 103 | +//! * You want a least-recently-used cache. |
| 104 | +//! |
| 105 | +//! # Correct and Efficient Usage of Collections |
| 106 | +//! |
| 107 | +//! Of course, knowing which collection is the right one for the job doesn't instantly |
| 108 | +//! permit you to use it correctly. Here are some quick tips for efficient and correct |
| 109 | +//! usage of the standard collections in general. If you're interested in how to use a |
| 110 | +//! specific collection in particular, consult its documentation for detailed discussion |
| 111 | +//! and code examples. |
| 112 | +//! |
| 113 | +//! ## Capacity Management |
| 114 | +//! |
| 115 | +//! Many collections provide several constructors and methods that refer to "capacity". |
| 116 | +//! These collections are generally built on top of an array. Optimally, this array would be |
| 117 | +//! exactly the right size to fit only the elements stored in the collection, but for the |
| 118 | +//! collection to do this would be very inefficient. If the backing array were exactly the |
| 119 | +//! right size at all times, then every time an element is inserted, the collection would |
| 120 | +//! have to grow the array to fit it. Due to the way memory is allocated and managed on most |
| 121 | +//! computers, this would almost surely require allocating an entirely new array and |
| 122 | +//! copying every single element from the old one into the new one. Hopefully you can |
| 123 | +//! see that this wouldn't be very efficient to do on every operation. |
| 124 | +//! |
| 125 | +//! Most collections therefore use an *amortized* allocation strategy. They generally let |
| 126 | +//! themselves have a fair amount of unoccupied space so that they only have to grow |
| 127 | +//! on occasion. When they do grow, they allocate a substantially larger array to move |
| 128 | +//! the elements into so that it will take a while for another grow to be required. While |
| 129 | +//! this strategy is great in general, it would be even better if the collection *never* |
| 130 | +//! had to resize its backing array. Unfortunately, the collection doesn't have |
| 131 | +//! enough information to do this on its own. Therefore, it is up to us programmers to give it |
| 132 | +//! hints. |
| 133 | +//! |
| 134 | +//! Any `with_capacity` constructor will instruct the collection to allocate enough space |
| 135 | +//! for the specified number of elements. Ideally this will be for exactly that many |
| 136 | +//! elements, but some implementation details may prevent this. `Vec` and `RingBuf` can |
| 137 | +//! be relied on to allocate exactly the requested amount, though. Use `with_capacity` |
| 138 | +//! when you know exactly how many elements will be inserted, or at least have a |
| 139 | +//! reasonable upper-bound on that number. |
| 140 | +//! |
| 141 | +//! When anticipating a large influx of elements, the `reserve` family of methods can |
| 142 | +//! be used to hint to the collection how much room it should make for the coming items. |
| 143 | +//! As with `with_capacity`, the precise behavior of these methods will be specific to |
| 144 | +//! the collection of interest. |
| 145 | +//! |
| 146 | +//! For optimal performance, collections will generally avoid shrinking themselves. |
| 147 | +//! If you believe that a collection will not soon contain any more elements, or |
| 148 | +//! just really need the memory, the `shrink_to_fit` method prompts the collection |
| 149 | +//! to shrink the backing array to the minimum size capable of holding its elements. |
| 150 | +//! |
| 151 | +//! Finally, if ever you're interested in what the actual capacity of the collection is, |
| 152 | +//! most collections provide a `capacity` method to query this information on demand. |
| 153 | +//! This can be useful for debugging purposes, or for use with the `reserve` methods. |
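The capacity tools described in this section could be exercised roughly as follows. This is a sketch against `Vec` on a current stable toolchain; it only checks lower bounds on the capacity, because the exact amount allocated is an implementation detail. `fill_with_hint` is a hypothetical helper.

```rust
/// Builds a vector of `n` elements, telling the vector up front how many
/// elements to expect so that the pushes never have to grow the array.
fn fill_with_hint(n: usize) -> Vec<usize> {
    let mut v = Vec::with_capacity(n);
    let capacity_before = v.capacity();
    for i in 0..n {
        v.push(i);
    }
    // The requested room was enough, so the backing array never grew.
    assert_eq!(v.capacity(), capacity_before);
    v
}

fn main() {
    let mut v = fill_with_hint(100);
    assert!(v.capacity() >= 100);

    // Hint that 50 more elements are on the way.
    v.reserve(50);
    assert!(v.capacity() >= v.len() + 50);

    // Drop most of the elements, then hand the spare memory back.
    v.truncate(10);
    v.shrink_to_fit();
    assert!(v.capacity() >= v.len());

    println!("len = {}, capacity = {}", v.len(), v.capacity());
}
```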
| 154 | +//! |
| 155 | +//! ## Iterators |
| 156 | +//! |
| 157 | +//! Iterators are a powerful and robust mechanism used throughout Rust's standard |
| 158 | +//! libraries. Iterators provide a sequence of values in a generic, safe, efficient |
| 159 | +//! and convenient way. The contents of an iterator are usually *lazily* evaluated, |
| 160 | +//! so that only the values that are actually needed are ever actually produced, and |
| 161 | +//! no allocation need be done to temporarily store them. Iterators are primarily |
| 162 | +//! consumed using a `for` loop, although many functions also take iterators where |
| 163 | +//! a collection or sequence of values is desired. |
| 164 | +//! |
| 165 | +//! All of the standard collections provide several iterators for performing bulk |
| 166 | +//! manipulation of their contents. The three primary iterators almost every collection |
| 167 | +//! should provide are `iter`, `iter_mut`, and `into_iter`. Some of these are not |
| 168 | +//! provided on collections where it would be unsound or unreasonable to provide them. |
| 169 | +//! |
| 170 | +//! `iter` provides an iterator of immutable references to all the contents of a |
| 171 | +//! collection in the most "natural" order. For sequence collections like `Vec`, this |
| 172 | +//! means the items will be yielded in increasing order of index starting at 0. For ordered |
| 173 | +//! collections like `BTreeMap`, this means that the items will be yielded in sorted order. |
| 174 | +//! For unordered collections like `HashMap`, the items will be yielded in whatever order |
| 175 | +//! the internal representation made most convenient. This is great for reading through |
| 176 | +//! all the contents of the collection. |
| 177 | +//! |
| 178 | +//! ``` |
| 179 | +//! let vec = vec![1u, 2, 3, 4]; |
| 180 | +//! for x in vec.iter() { |
| 181 | +//! println!("vec contained {}", x); |
| 182 | +//! } |
| 183 | +//! ``` |
| 184 | +//! |
| 185 | +//! `iter_mut` provides an iterator of *mutable* references in the same order as `iter`. |
| 186 | +//! This is great for mutating all the contents of the collection. |
| 187 | +//! |
| 188 | +//! ``` |
| 189 | +//! let mut vec = vec![1u, 2, 3, 4]; |
| 190 | +//! for x in vec.iter_mut() { |
| 191 | +//! *x += 1; |
| 192 | +//! } |
| 193 | +//! ``` |
| 194 | +//! |
| 195 | +//! `into_iter` transforms the actual collection into an iterator over its contents |
| 196 | +//! by-value. This is great when the collection itself is no longer needed, and the |
| 197 | +//! values are needed elsewhere. Using `extend` with `into_iter` is the main way that |
| 198 | +//! contents of one collection are moved into another. Calling `collect` on an iterator |
| 199 | +//! itself is also a great way to convert one collection into another. Both of these |
| 200 | +//! methods should internally use the capacity management tools discussed in the |
| 201 | +//! previous section to do this as efficiently as possible. |
| 202 | +//! |
| 203 | +//! ``` |
| 204 | +//! let mut vec1 = vec![1u, 2, 3, 4]; |
| 205 | +//! let vec2 = vec![10u, 20, 30, 40]; |
| 206 | +//! vec1.extend(vec2.into_iter()); |
| 207 | +//! ``` |
| 208 | +//! |
| 209 | +//! ``` |
| 210 | +//! use std::collections::RingBuf; |
| 211 | +//! |
| 212 | +//! let vec = vec![1u, 2, 3, 4]; |
| 213 | +//! let buf: RingBuf<uint> = vec.into_iter().collect(); |
| 214 | +//! ``` |
| 215 | +//! |
| 216 | +//! Iterators also provide a series of *adapter* methods for performing common tasks to |
| 217 | +//! sequences. Among the adapters are functional favorites like `map`, `fold`, `skip`, |
| 218 | +//! and `take`. Of particular interest to collections is the `rev` adapter, which |
| 219 | +//! reverses any iterator that supports this operation. Most collections provide reversible |
| 220 | +//! iterators as the way to iterate over them in reverse order. |
| 221 | +//! |
| 222 | +//! ``` |
| 223 | +//! let vec = vec![1u, 2, 3, 4]; |
| 224 | +//! for x in vec.iter().rev() { |
| 225 | +//! println!("vec contained {}", x); |
| 226 | +//! } |
| 227 | +//! ``` |
| 228 | +//! |
| 229 | +//! Several other collection methods also return iterators to yield a sequence of results |
| 230 | +//! but avoid allocating an entire collection to store the result in. This provides maximum |
| 231 | +//! flexibility as `collect` or `extend` can be called to "pipe" the sequence into any |
| 232 | +//! collection if desired. Otherwise, the sequence can be looped over with a `for` loop. The |
| 233 | +//! iterator can also be discarded after partial use, preventing the computation of the unused |
| 234 | +//! items. |
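To make the laziness concrete, here is a small sketch, written for a current stable toolchain, that counts how many input items are actually inspected when only part of the sequence is consumed. `first_even_squares` is a hypothetical helper; `inspect`, `filter`, `map`, and `take` are standard iterator adapters.

```rust
/// Returns the squares of the first `n` even numbers found in `input`,
/// together with the number of input items that were actually looked at.
fn first_even_squares(input: &[u32], n: usize) -> (Vec<u32>, usize) {
    let mut inspected = 0;
    let result: Vec<u32> = input
        .iter()
        // Count each item as it is pulled through the pipeline.
        .inspect(|_| inspected += 1)
        .filter(|&&x| x % 2 == 0)
        .map(|&x| x * x)
        // Stop asking for items once `n` results have been produced.
        .take(n)
        .collect();
    (result, inspected)
}

fn main() {
    let (squares, inspected) = first_even_squares(&[1, 2, 3, 4, 5, 6, 7, 8], 2);
    println!("{:?} after inspecting {} items", squares, inspected);
}
```

Because `take` stops requesting items once it has yielded `n` of them, the tail of the input is never visited.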
| 235 | +//! |
| 236 | +//! ## Entries |
| 237 | +//! |
| 238 | +//! The `entry` API is intended to provide an efficient mechanism for manipulating |
| 239 | +//! the contents of a map conditionally on whether a key is present. The primary |
| 240 | +//! motivating use case for this is to provide efficient accumulator maps. For instance, |
| 241 | +//! if one wishes to maintain a count of the number of times each key has been seen, |
| 242 | +//! they will have to perform some conditional logic on whether this is the first time |
| 243 | +//! the key has been seen or not. Normally, this would require a `find` followed by an |
| 244 | +//! `insert`, effectively duplicating the search effort on each insertion. |
| 245 | +//! |
| 246 | +//! When a user calls `map.entry(key)`, the map will search for the key and then yield |
| 247 | +//! a variant of the `Entry` enum. |
| 248 | +//! |
| 249 | +//! If a `Vacant(entry)` is yielded, then the key *was not* found. In this case the |
| 250 | +//! only valid operation is to `set` the value of the entry. When this is done, |
| 251 | +//! the vacant entry is consumed and converted into a mutable reference to the |
| 252 | +//! value that was inserted. This allows for further manipulation of the value |
| 253 | +//! beyond the lifetime of the search itself. This is useful if complex logic needs to |
| 254 | +//! be performed on the value regardless of whether the value was just inserted. |
| 255 | +//! |
| 256 | +//! If an `Occupied(entry)` is yielded, then the key *was* found. In this case, the user |
| 257 | +//! has several options: they can `get`, `set`, or `take` the value of the occupied |
| 258 | +//! entry. Additionally, they can convert the occupied entry into a mutable reference |
| 259 | +//! to its value, providing symmetry to the vacant `set` case. |
| 260 | +//! |
| 261 | +//! ### Examples |
| 262 | +//! |
| 263 | +//! Here are the two primary ways in which `entry` is used. First, a simple example |
| 264 | +//! where the logic performed on the values is trivial. |
| 265 | +//! |
| 266 | +//! #### Counting the number of times each character in a string occurs |
| 267 | +//! |
| 268 | +//! ``` |
| 269 | +//! use std::collections::btree::{BTreeMap, Occupied, Vacant}; |
| 270 | +//! |
| 271 | +//! let mut count = BTreeMap::new(); |
| 272 | +//! let message = "she sells sea shells by the sea shore"; |
| 273 | +//! |
| 274 | +//! for c in message.chars() { |
| 275 | +//! match count.entry(c) { |
| 276 | +//! Vacant(entry) => { entry.set(1u); }, |
| 277 | +//! Occupied(mut entry) => *entry.get_mut() += 1, |
| 278 | +//! } |
| 279 | +//! } |
| 280 | +//! |
| 281 | +//! assert_eq!(count.find(&'s'), Some(&8)); |
| 282 | +//! |
| 283 | +//! println!("Number of occurrences of each character"); |
| 284 | +//! for (char, count) in count.iter() { |
| 285 | +//! println!("{}: {}", char, count); |
| 286 | +//! } |
| 287 | +//! ``` |
| 288 | +//! |
| 289 | +//! When the logic to be performed on the value is more complex, we may simply use |
| 290 | +//! the `entry` API to ensure that the value is initialized, and perform the logic |
| 291 | +//! afterwards. |
| 292 | +//! |
| 293 | +//! #### Tracking the inebriation of customers at a bar |
| 294 | +//! |
| 295 | +//! ``` |
| 296 | +//! use std::collections::btree::{BTreeMap, Occupied, Vacant}; |
| 297 | +//! |
| 298 | +//! // A client of the bar. They have an id and a blood alcohol level. |
| 299 | +//! struct Person { id: u32, blood_alcohol: f32 }; |
| 300 | +//! |
| 301 | +//! // All the orders made to the bar, by client id. |
| 302 | +//! let orders = vec![1,2,1,2,3,4,1,2,2,3,4,1,1,1]; |
| 303 | +//! |
| 304 | +//! // Our clients. |
| 305 | +//! let mut blood_alcohol = BTreeMap::new(); |
| 306 | +//! |
| 307 | +//! for id in orders.into_iter() { |
| 308 | +//! // If this is the first time we've seen this customer, initialize them |
| 309 | +//! // with no blood alcohol. Otherwise, just retrieve them. |
| 310 | +//! let person = match blood_alcohol.entry(id) { |
| 311 | +//! Vacant(entry) => entry.set(Person{id: id, blood_alcohol: 0.0}), |
| 312 | +//! Occupied(entry) => entry.into_mut(), |
| 313 | +//! }; |
| 314 | +//! |
| 315 | +//! // Reduce their blood alcohol level. It takes time to order and drink a beer! |
| 316 | +//! person.blood_alcohol *= 0.9; |
| 317 | +//! |
| 318 | +//! // Check if they're sober enough to have another beer. |
| 319 | +//! if person.blood_alcohol > 0.3 { |
| 320 | +//! // Too drunk... for now. |
| 321 | +//! println!("Sorry {}, I have to cut you off", person.id); |
| 322 | +//! } else { |
| 323 | +//! // Have another! |
| 324 | +//! person.blood_alcohol += 0.1; |
| 325 | +//! } |
| 326 | +//! } |
| 327 | +//! ``` |
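For readers on a current stable toolchain, the character-counting example might be sketched as below. This is an assumption about newer releases rather than the API shown in this diff: there the `Entry` enum lives in `std::collections::btree_map`, a vacant entry is filled with `insert` instead of `set`, and lookups use `get` instead of `find`. `count_chars` and `count_chars_short` are hypothetical helpers.

```rust
use std::collections::btree_map::Entry;
use std::collections::BTreeMap;

/// Counts how often each character occurs, matching on the entry explicitly.
fn count_chars(message: &str) -> BTreeMap<char, u32> {
    let mut count = BTreeMap::new();
    for c in message.chars() {
        match count.entry(c) {
            // First time this character is seen: start its count at one.
            Entry::Vacant(entry) => {
                entry.insert(1);
            }
            // Seen before: bump the existing count in place.
            Entry::Occupied(mut entry) => {
                *entry.get_mut() += 1;
            }
        }
    }
    count
}

/// The same accumulation using the `or_insert` convenience method.
fn count_chars_short(message: &str) -> BTreeMap<char, u32> {
    let mut count = BTreeMap::new();
    for c in message.chars() {
        *count.entry(c).or_insert(0) += 1;
    }
    count
}

fn main() {
    let message = "she sells sea shells by the sea shore";
    let count = count_chars(message);
    assert_eq!(count, count_chars_short(message));
    for (c, n) in &count {
        println!("{}: {}", c, n);
    }
}
```

Either way, the key is searched for only once per character.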
14 | 328 |
|
15 | 329 | #![experimental]
|
16 | 330 |
|
|