
Frequently Asked Questions

Robbie Hanson edited this page May 16, 2015 · 24 revisions

The standard FAQ page. As you submit 'em, this page keeps getting bigger...

 


### Are there any recommended "best practices"?

There are a few best practices you should follow in order to prevent "shooting yourself in the foot". These best practices follow naturally from a basic understanding of how YapDatabase works. Thus, I'll highlight the basics first, and then list the associated best practices.

For more detail, see the Performance Primer article.

  • Every YapDatabaseConnection only supports a single transaction at a time.

Essentially, each YapDatabaseConnection has an internal serial queue. All transactions on the connection must go through the connection's serial queue. This includes both read-only transactions and read-write transactions. It also includes async transactions.

This means that connections are thread-safe. That is, you can safely use a single connection from multiple threads. But do not mistake thread-safety for concurrency. Thread-safe != concurrent.

A read-write transaction on connectionA will block read-only transactions on connectionA until the read-write transaction completes. (Even if it's an asyncReadWrite transaction.)

You can get a similar effect if you have a really really expensive read-only transaction. For example, you loop over every object in the database and perform some expensive complex math for each one. If you do this expensive read-only transaction on connectionA in a background thread, you're still blocking connectionA for other threads.

  • Concurrency comes through using multiple connections.

Concurrency in YapDatabase is incredibly simple to achieve. You just create and use multiple connections. And creating a new connection is a one-liner.

This brings us to Best Practice #1 :

  • Be mindful of read-write transactions & expensive read-only transactions
  • Perform such transactions on dedicated connections
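As a rough sketch of Best Practice #1, the snippet below runs an expensive read-only scan on its own connection so that other connections are not blocked by it. The function name, the `database` parameter, and the "items" collection are hypothetical, and the import path assumes YapDatabase is integrated as a framework (for example via CocoaPods).

```objc
#import <Foundation/Foundation.h>
#import <YapDatabase/YapDatabase.h>  // assumption: framework-style import

// Hypothetical helper. `database` is an already-opened YapDatabase instance.
static void RunExpensiveScan(YapDatabase *database)
{
    // Creating a connection is a one-liner. Each connection has its own serial queue.
    YapDatabaseConnection *scanConnection = [database newConnection];

    // The async variant returns immediately; the block runs on the connection's queue.
    [scanConnection asyncReadWithBlock:^(YapDatabaseReadTransaction *transaction) {

        // "items" is a made-up collection name used only for illustration.
        [transaction enumerateKeysAndObjectsInCollection:@"items"
                                              usingBlock:^(NSString *key, id object, BOOL *stop) {
            // Expensive per-object work would go here.
            // It only occupies scanConnection, not your other connections.
        }];
    }];
}
```

In a real app you would probably keep such a connection around and reuse it, rather than create a new one for every scan.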

And speaking of read-write transactions...

  • A database can only perform a single read-write transaction at a time.

This is an inherent limitation of sqlite. And it means that even if you have multiple YapDatabaseConnections, all your readWrite transactions will execute in a serial fashion.

Recall that each YapDatabaseConnection has a serial queue, and that all transactions on that connection go through the connection's serial queue. In a similar fashion, YapDatabase has a serial queue for read-write transactions, and all read-write transactions (regardless of connection) must go through this "global" serial queue.

Which brings us to Best Practice #2 :

  • Never use a read-write transaction if a read-only transaction will do.

A read-only transaction is more lightweight than a read-write transaction. Plus read-only transactions increase concurrency.

The great thing about a read-only transaction on connectionA is that it can execute in parallel with a read-write transaction on connectionB.

And this means that you can easily avoid blocking your main thread.

Which brings us to Best Practice #3 :

  • Use a dedicated connection for the main thread
  • Do not use this connection anywhere but on your main thread
  • Do not execute any readWrite transactions with this connection
  • Only execute read-only transactions with this connection
  • Create separate YapDatabaseConnection(s) for background operations
  • Use these separate connections to do your readWrite transactions

The rationale behind this last "best practice" should be understandable. You don't want to block the UI thread. So you have a dedicated read-only connection for it. Which means that it only executes read-only transactions. Which means it won't ever block due to "expensive" read-write transactions.
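Here is one way Best Practice #3 might look in code. This is only a sketch: the `MyDatabaseManager` class and its method names are hypothetical, and the import path assumes a framework-style integration.

```objc
#import <Foundation/Foundation.h>
#import <YapDatabase/YapDatabase.h>  // assumption: framework-style import

// Hypothetical holder for a UI connection and a background connection.
@interface MyDatabaseManager : NSObject
@property (nonatomic, strong, readonly) YapDatabaseConnection *uiConnection; // main thread, read-only use
@property (nonatomic, strong, readonly) YapDatabaseConnection *bgConnection; // background work, writes
- (instancetype)initWithDatabase:(YapDatabase *)database;
@end

@implementation MyDatabaseManager

- (instancetype)initWithDatabase:(YapDatabase *)database
{
    if ((self = [super init])) {
        _uiConnection = [database newConnection];
        _bgConnection = [database newConnection];
    }
    return self;
}

// Call this on the main thread only. It never performs a read-write transaction.
- (id)objectForUIWithKey:(NSString *)key inCollection:(NSString *)collection
{
    __block id object = nil;
    [self.uiConnection readWithBlock:^(YapDatabaseReadTransaction *transaction) {
        object = [transaction objectForKey:key inCollection:collection];
    }];
    return object;
}

// Can be called from any thread. The write runs on bgConnection's queue,
// so it does not block reads on uiConnection.
- (void)saveObject:(id)object forKey:(NSString *)key inCollection:(NSString *)collection
{
    [self.bgConnection asyncReadWriteWithBlock:^(YapDatabaseReadWriteTransaction *transaction) {
        [transaction setObject:object forKey:key inCollection:collection];
    }];
}

@end
```

Nothing in the library enforces the read-only rule for `uiConnection`. It is a convention you keep by routing every write through the background connection.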

Now having a read-only connection means you're going to need a way to notify the main thread when changes have occurred that require updating UI components (such as a UITableView).

Which brings us to Best Practice #4 :

Follow these best practices and you'll enjoy just how simple and powerful YapDatabase can be.


### Can I use KVO on objects?

Key-Value Observing in a database system is dependent upon several things. First, the objects that you fetch from the database must be mutable. Second, the mutable objects would need to be automatically updated by something. As in, changes to objects made on other threads/connections must get merged into the objects you already have in your hand (the objects you've already fetched from the database) on your thread. In order for this to happen:

  • objects must be tied to a specific connection (so the connection knows what objects to update)
  • objects must be mutable (so the connection can update them)

In order to satisfy these conditions we wind up going down the same road that has made Core Data such a pain. That is, our objects become non-thread-safe, and tied to a specific connection. Furthermore, it becomes mandatory (not optional) to use KVO as our objects might get changed underneath us at any time.

In addition to this, all objects that get stored in the database would need to support some kind of merge operation. At first this might seem feasible. But the feasibility comes into question when we realize YapDatabase can store basic objects such as NSStrings, NSNumbers, etc. And this is why Core Data requires "object wrappers" for everything. Even if you just want to store a simple string in the database, it has to be wrapped in some NSManagedObject wrapper.

The fundamental architecture and philosophy behind YapDatabase is radically different from Core Data.

YapDatabase:

  • Key/value oriented with extensions
  • Connections are thread safe
  • Fetched objects are "bare" objects
  • Straightforward concurrency

Core Data:

  • Object & relationship oriented
  • NSManagedObjectContext is not thread safe
  • Fetched objects are subclasses of NSManagedObject wrapper class
  • Fetched objects are tied to a specific context and are thus not thread-safe
  • Each context monitors and manages every fetched object
  • Concurrency requires manual merges and conflict resolution

Long story short, pure KVO is not supported by YapDatabase. Supporting it would require us to make concessions that would defeat the original purpose of the project. However, YapDatabase does support a method of observing changes to specific keys / objects.

You can register for the YapDatabaseModifiedNotification. When you receive notification(s), simply pass the notification object to the connection to see if any particular keys (which you may be "observing") have changed.

See the YapDatabaseModifiedNotification wiki page for some code samples.

From YapDatabaseConnection.h :

// Query for any change to a collection

- (BOOL)hasChangeForCollection:(NSString *)collection inNotifications:(NSArray *)notifications;
- (BOOL)hasObjectChangeForCollection:(NSString *)collection inNotifications:(NSArray *)notifications;
- (BOOL)hasMetadataChangeForCollection:(NSString *)collection inNotifications:(NSArray *)notifications;

// Query for a change to a particular key/collection tuple

- (BOOL)hasChangeForKey:(NSString *)key
           inCollection:(NSString *)collection
        inNotifications:(NSArray *)notifications;

- (BOOL)hasObjectChangeForKey:(NSString *)key
                 inCollection:(NSString *)collection
              inNotifications:(NSArray *)notifications;

- (BOOL)hasMetadataChangeForKey:(NSString *)key
                   inCollection:(NSString *)collection
                inNotifications:(NSArray *)notifications;

// Query for a change to a particular set of keys in a collection

- (BOOL)hasChangeForAnyKeys:(NSSet *)keys
               inCollection:(NSString *)collection
            inNotifications:(NSArray *)notifications;

- (BOOL)hasObjectChangeForAnyKeys:(NSSet *)keys
                     inCollection:(NSString *)collection
                  inNotifications:(NSArray *)notifications;

- (BOOL)hasMetadataChangeForAnyKeys:(NSSet *)keys
                       inCollection:(NSString *)collection
                    inNotifications:(NSArray *)notifications;
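
To show how these methods might be used together with YapDatabaseModifiedNotification, here is a minimal sketch. The `UserObserver` class, the "users" collection, and the "user-123" key are hypothetical. It assumes the connection is a dedicated main-thread connection that uses a long-lived read transaction, which is one common arrangement and not the only one.

```objc
#import <Foundation/Foundation.h>
#import <YapDatabase/YapDatabase.h>  // assumption: framework-style import

// Hypothetical observer that watches a single key.
@interface UserObserver : NSObject
- (instancetype)initWithUIConnection:(YapDatabaseConnection *)uiConnection;
@end

@implementation UserObserver {
    YapDatabaseConnection *_uiConnection;
}

- (instancetype)initWithUIConnection:(YapDatabaseConnection *)uiConnection
{
    if ((self = [super init])) {
        _uiConnection = uiConnection;

        // Pin the connection to its current commit.
        [_uiConnection beginLongLivedReadTransaction];

        [[NSNotificationCenter defaultCenter] addObserver:self
                                                 selector:@selector(databaseModified:)
                                                     name:YapDatabaseModifiedNotification
                                                   object:_uiConnection.database];
    }
    return self;
}

- (void)databaseModified:(NSNotification *)notification
{
    // Move the connection to the latest commit. The return value is the list of
    // notifications for the commits that were jumped over.
    NSArray *notifications = [_uiConnection beginLongLivedReadTransaction];

    // "user-123" and "users" are made-up values for illustration.
    if ([_uiConnection hasChangeForKey:@"user-123"
                          inCollection:@"users"
                       inNotifications:notifications])
    {
        // Re-fetch the object and refresh whatever UI depends on it.
    }
}

- (void)dealloc
{
    [[NSNotificationCenter defaultCenter] removeObserver:self];
}

@end
```

If the connection does not use a long-lived read transaction, the simpler route described above applies: wrap the received notification in an array and pass that as the `inNotifications:` argument.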

### Why SQLite?

It's a valid question worth pondering. For a key/value store there are many other possible underlying databases. So it's worth reviewing the reasons why sqlite was chosen as the underlying datastore.

Reason 1: Purpose

Most often this "question" is posed with teeth. Something along the lines of:

I read that database X is 5% faster than sqlite according to tailored benchmarks I read on the website for database X. This proves that sqlite is a dead technology. Therefore, you suck! And YapDatabase sucks!

It's important to understand why there are so many databases in the world. It's because there are so many different scenarios for using a database. There are client applications. And server applications. And server applications that run in parallel on a cluster of thousands of servers. And we can break down these domains much further. Within a client application, do you need concurrency? What kind and size of data are you storing? Is your database acting as the primary backing store for the user interface?

For example, there may be a key/value database that only supports storing strings (for both key & value). The strings can only be up to a certain length. It supports concurrent readers. But only a single writer, and the writer blocks readers while it's operating. However, despite these restrictions, this database absolutely screams. The performance is just insanely awesome. So ... would this database meet your requirements? The answer may be YES. And if that's the case, I'd be the first to tell you that you should go use that database instead.

Also, most applications have multiple sets of requirements. If an app needs to store 10 different items in a datastore, then it may have 10 different unique sets of requirements. Who told you that you need to use a single datastore to handle everything? (There is no "one database to rule them all!")

So what is YapDatabase great at that other databases are not-so-great at?

YapDatabase is great for making apps. Client side apps. For Mac & iOS. It's designed to help you deal with tableViews & collectionViews. It knows the main thread is for UI, and helps you avoid blocking it. It has straightforward concurrency, and built-in caching. It can give you a long-lived read-only connection for the main thread, and pin it to a certain commit. It allows you to move that connection to newer commits in an atomic fashion. And when you do, it will tell you exactly what changed, and how that affects your user interface.

Simply put, YapDatabase was designed with modern client side apps in mind. Concurrency was not an afterthought - it was baked into the original design. Cloud sync was not an afterthought - it requires concurrency and extensibility. The ability to drive the UI was not an afterthought - it's why YapDatabase comes with views & full text search & secondary indexes.

And that brings us to the next reason.

Reason 2: Extensibility

Perhaps we could swap out sqlite for some other key/value database. And perhaps you'd get a 4% performance improvement. But would you rather have that, or an extension to YapDatabase that provides Full Text Search? Or an extension that provides support for your favorite cloud sync service? Or an extension for persistent views? Or secondary indexes? Or an extension for R*Trees that provides efficient geospatial queries? Or the ability to write your own extension that has access to the full power of SQLite underneath it?

A minor performance improvement is a theoretical question of little importance most of the time. If you're already using a key/value database then your bottleneck isn't likely something that can be solved with another key/value database that's barely faster. You need something else entirely. You need an efficient way to sort your data for presentation in a table view. Or a secondary index on a particular property to speed up an important query. Or full text search. Etc.

And this is why I think SQLite is a great fit. YapDatabase provides simplicity up front. But the power of YapDatabase is not in the key/value store, but in the extensions it provides atop this base layer. And these extensions have access to the power and flexibility of sqlite under the hood.

For more information on extensions, see the Extensions wiki article.

Reason 3: Dependability

SQLite has been around for over a decade. It's used almost everywhere. It's even what Apple uses under the hood of Core Data. You don't get to this level of ubiquity without being dependable.

Reason 4: Availability

SQLite is on all versions of Mac OS X and iOS (at least for as far back as I can remember). So there are no third party C++ libraries to download and compile. There are no dependency errors, or makefiles to tweak. It's been part of the system for a long long time.

Reason 5: It's Free

And that's a tough price to beat.

Long story short, sqlite was chosen because it's the best tool for the job. And the job is making apps. And everything that goes along with it.


### Should I store images in YapDatabase?

First, to answer an alternative question: "Can I store images in YapDatabase?"

And the answer is Yes, you certainly can. Both UIImage & NSImage support NSCoding. Thus you can store images:

  • directly [transaction setObject:myUIImage forKey:key inCollection:collection]
  • within another object [transaction setObject:userObjectWithAvatarImageInside forKey:key inCollection:collection]
  • or by converting to JPEG/PNG, wrapping that in NSData, and then storing NSData [transaction setObject:jpegInsideNSData forKey:key inCollection:collection]

But just because something can be done doesn't imply you should do it. So let's discuss performance, and alternative options.

First, there is a difference between storing an image by itself, and storing it within another object. For example, say you have a User object. And every user has an associated avatar (which is an image). Should you store the image directly within the user object?

Well, if you do so, then every time you fetch a User object you're also fetching the image. So if you're fetching a User from disk that's normally only 2K worth of bytes, you're instead fetching 22K worth of bytes because of the image. Do you want/need the avatar every time you fetch a User object?

On top of this, fetching a User object will now also put the User+Image in the cache (as the image is a property of the user). Meaning that enumerating a bunch of users (for non-avatar purposes) will inflate the size (in bytes) of your cache.

Thus it is generally recommended that, if you're going to store images in the database, you store them separately from their associated object. Continuing with the User example, one could do something like this:

@interface MyUser : NSObject <NSCoding>
// ...
@property (nonatomic, copy) NSString *avatarKey; // key of the image in the "avatars" collection
@end

And then you can always fetch the avatar if/when you want it:

UIImage *avatar = [transaction objectForKey:user.avatarKey inCollection:@"avatars"];
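
For the JPEG-inside-NSData option mentioned earlier, a sketch of storing and fetching an avatar separately from its user could look like the following. The function names and the 0.8 compression quality are arbitrary choices for illustration, and the import path assumes a framework-style integration.

```objc
#import <UIKit/UIKit.h>
#import <YapDatabase/YapDatabase.h>  // assumption: framework-style import

// Hypothetical helper: encodes the image as JPEG and stores the bytes
// under avatarKey in the "avatars" collection.
static void StoreAvatar(YapDatabaseConnection *bgConnection, UIImage *avatar, NSString *avatarKey)
{
    NSData *jpegData = UIImageJPEGRepresentation(avatar, 0.8); // 0.8 is an arbitrary quality
    if (jpegData == nil) return; // encoding can fail, e.g. for an image with no bitmap data

    [bgConnection asyncReadWriteWithBlock:^(YapDatabaseReadWriteTransaction *transaction) {
        [transaction setObject:jpegData forKey:avatarKey inCollection:@"avatars"];
    }];
}

// Hypothetical helper: fetches the JPEG bytes and decodes them only when the avatar is needed.
static UIImage *LoadAvatar(YapDatabaseConnection *connection, NSString *avatarKey)
{
    __block NSData *jpegData = nil;
    [connection readWithBlock:^(YapDatabaseReadTransaction *transaction) {
        jpegData = [transaction objectForKey:avatarKey inCollection:@"avatars"];
    }];
    return jpegData ? [UIImage imageWithData:jpegData] : nil;
}
```

Fetching a MyUser never touches the "avatars" collection this way, so enumerating users does not pull image bytes into the cache.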

Further, you can use the YapDatabaseRelationship extension to ensure that the avatar is automatically deleted from the database if you ever delete the associated user.

But is it faster to store my images in the database or directly on disk?

Let's look at some numbers: https://www.sqlite.org/intern-v-extern-blob.html

On Apple systems, the default page_size is 4096. (And the page_size is configurable via YapDatabaseOptions, with caveats you can read about in the header file.) Which means, according to the chart, it's actually faster to store small images in a sqlite database. The break-even point is somewhere between 50K & 100K (according to the chart).

Important: The referenced test was done on a Linux workstation, using an Ext4 filesystem, with a SATA disk. Do you really think the numbers are going to be the same on an Apple system, using an HFS+ filesystem, with a flash disk? ... So if you're looking for a "hand-wavy rule of thumb", then saying the performance for "small" images is faster with sqlite is probably fine. But if you're serious about this particular performance optimization, then I'd strongly encourage you to run your own benchmarks on target systems.

So if your images are big, it would be preferable to do something like this:

@interface MyUser : NSObject <NSCoding>
// ...
@property (nonatomic, copy) NSString *avatarFilePath; // location of the image file on disk
@end

And again, you can use the YapDatabaseRelationship extension to ensure that the avatar file is automatically deleted from the filesystem if you ever delete the associated user. (Yes, YapDatabaseRelationship supports creating a relationship between an object in the database and a file on disk.)
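
A rough sketch of the on-disk variant follows. It only covers writing and reading the image file. The function names, the Application Support location, and the JPEG quality are assumptions made for illustration, and the YapDatabaseRelationship setup is intentionally left out.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: writes the avatar as a JPEG file and returns its path,
// or nil on failure. The returned path could be stored in user.avatarFilePath.
static NSString *SaveAvatarToDisk(UIImage *avatar, NSString *fileName)
{
    NSData *jpegData = UIImageJPEGRepresentation(avatar, 0.8); // arbitrary quality
    if (jpegData == nil) return nil;

    // Assumption: images live in the app's Application Support directory.
    NSString *dir = [NSSearchPathForDirectoriesInDomains(NSApplicationSupportDirectory,
                                                         NSUserDomainMask, YES) firstObject];
    if (dir == nil) return nil;

    // The directory may not exist yet on a fresh install.
    if (![[NSFileManager defaultManager] createDirectoryAtPath:dir
                                   withIntermediateDirectories:YES
                                                    attributes:nil
                                                         error:NULL]) {
        return nil;
    }

    NSString *path = [dir stringByAppendingPathComponent:fileName];
    return [jpegData writeToFile:path atomically:YES] ? path : nil;
}

// Hypothetical helper: loads the image directly from its file.
static UIImage *LoadAvatarFromDisk(NSString *avatarFilePath)
{
    return [UIImage imageWithContentsOfFile:avatarFilePath];
}
```

One caveat: on iOS the absolute path of the app container can change between installs or updates, so it may be safer to store only the file name and rebuild the full path at load time.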

There is one last thing that is possibly worth mentioning. From Apple's docs (for UIImage):

"In low-memory situations, image data may be purged from a UIImage object to free up memory on the system. This purging behavior affects only the image data stored internally by the UIImage object and not the object itself. When you attempt to draw an image whose data has been purged, the image object automatically reloads the data from its original file."

This auto-purging technique will only work for images loaded directly from a file on disk (not from an image in the database). This is because, if you load an image from the database, you're going to essentially be creating an image from data-in-memory. Which forces UIImage to retain its image data in low-memory situations, as it has no direct filesystem path to reload from.

Does this affect you? I'm not entirely sure, and it's a tough question to answer. Perhaps if you load a LOT of images from the database. And your app uses up a lot of memory by displaying many many small images at the same time. And your app is deeply deeply nested, where dozens of view controllers may be hidden in something like a navigation stack. Then perhaps, in this situation, it may be beneficial to allow the OS to automatically purge image data from all those images that are hidden in view controllers that are 6+ layers back. Maybe? This one is rather app-specific.
