Commit f152369

committed
Worked on Spotlight store format support
1 parent 27d1a8f commit f152369

7 files changed

+1068
-251
lines changed

documentation/Apple Spotlight store database file format.asciidoc

+206-29
@@ -1,4 +1,4 @@
1-
= Apple Spotlight store database file format
1+
= Apple Spotlight store file formats
22

33
:toc:
44
:toclevels: 4
@@ -7,29 +7,29 @@
77
[abstract]
88
== Summary
99

10-
The Apple Spotlight store database file format is used by MacOS and iOS to
11-
store the Spotlight desktop search information. This specification is based
12-
on the source code and documentation.
10+
The Apple Spotlight store file formats are used by macOS and iOS to store
11+
the Spotlight desktop search information. This specification is based
12+
on available source code and documentation.
1313

1414
This document is intended as a working document for the Apple Spotlight
15-
store database file format specification.
15+
store file formats specification.
1616

1717
[preface]
1818
== Document information
1919

2020
[cols="1,5"]
2121
|===
2222
| Author(s): | Joachim Metz <[email protected]>
23-
| Abstract: | Apple Spotlight store database file format
23+
| Abstract: | Apple Spotlight store file formats
2424
| Classification: | Public
25-
| Keywords: | Apple Spotlight store database, store.db
25+
| Keywords: | Apple Spotlight store, store.db
2626
|===
2727

2828
[preface]
2929
== License
3030

3131
....
32-
Copyright (C) 2020, Joachim Metz <[email protected]>.
32+
Copyright (C) 2020-2023, Joachim Metz <[email protected]>.
3333
Permission is granted to copy, distribute and/or modify this document under the
3434
terms of the GNU Free Documentation License, Version 1.3 or any later version
3535
published by the Free Software Foundation; with no Invariant Sections, no
@@ -44,28 +44,69 @@ in the section entitled "GNU Free Documentation License".
4444
|===
4545
| Version | Author | Date | Comments
4646
| 0.0.1 | J.B. Metz | June 2020 | Initial version based on earlier notes, with thanks to Everest Munro-Zeisberger
47+
| 0.0.2 | J.B. Metz | June 2023 | Additional changes based on format analysis
4748
|===
4849

4950
:numbered:
5051
== Overview
5152

52-
The Apple Spotlight store database file format is used by MacOS and iOS to
53-
store the Spotlight desktop search information. This specification is based
54-
on the source code and documentation.
53+
The Apple Spotlight store file formats are used by macOS and iOS to store
54+
the Spotlight desktop search information. This specification is based on
55+
available source code and documentation.
56+
57+
The Spotlight store is typically stored under:
58+
59+
....
60+
/.Spotlight-V100/Store-V2/${UUID}/
61+
....
62+
63+
It can also be stored in other locations, such as:
5564

5665
....
57-
/.Spotlight-V100/Store-V2/${STORE_UUID}/store.db
58-
/.Spotlight-V100/Store-V2/${STORE_UUID}/.store.db
66+
/Users/${USERNAME}/Library/Caches/com.apple.helpd/index.spotlightV3/
67+
/Users/${USERNAME}/Library/Metadata/CoreSpotlight/index.spotlightV3/
68+
/Users/${USERNAME}/Library/Developer/Xcode/DocumentationCache/v18/9.0.1/DeveloperDocumentation.index/
5969
....
6070

71+
The Spotlight store consists of the following files:
72+
73+
* .store.db
74+
* store.db
75+
76+
In macOS 10.15 (Catalina) the following <<database_streams_map_files,database streams map files>>
77+
were added:
78+
79+
* dbStr-#.map.buckets
80+
* dbStr-#.map.data
81+
* dbStr-#.map.header
82+
* dbStr-#.map.offsets
83+
6184
[cols="1,5",options="header"]
6285
|===
6386
| Characteristics | Description
6487
| Byte order | little-endian
65-
| Date and time values |
88+
| Date and time values | Cocoa and POSIX date and time values
6689
| Character strings | UTF-8 formatted strings
6790
|===
6891

92+
=== Test versions
93+
94+
The following versions of programs were used to test the information within this
95+
document:
96+
97+
* Mac OS X 10.7 (Lion)
98+
* Mac OS X 10.8 (Mountain Lion)
99+
* Mac OS X 10.9 (Mavericks)
100+
* Mac OS X 10.10 (Yosemite)
101+
* Mac OS X 10.11 (El Capitan)
102+
* macOS 10.12 (Sierra)
103+
* macOS 10.13 (High Sierra)
104+
* macOS 10.14 (Mojave)
105+
* macOS 10.15 (Catalina)
106+
* macOS 11 (Big Sur)
107+
* macOS 12 (Monterey)
108+
* macOS 13 (Ventura)
109+
69110
== Store database file
70111

71112
A store database (store.db) file consists of:
@@ -85,7 +126,8 @@ The file header is at least 720 bytes in size and consists of:
85126
|===
86127
| Offset | Size | Value | Description
87128
| 0 | 4 | "8tsd" | Signature
88-
| 4 | 4 | | Flags
129+
| 4 | 4 | | Flags +
130+
See section: <<file_header_flags,File header flags>>
89131
| 8 | 4 | | [yellow-background]*Unknown (0-byte values)*
90132
| 12 | 4 | | [yellow-background]*Unknown* +
91133
Seen: 0x0c
@@ -113,6 +155,24 @@ The file header is stored in the first 4096 bytes
113155
The signature is "dst8" in little-endian, which could represent something
114156
along the lines of "data store".
115157

158+
==== [[file_header_flags]]File header flags
159+
160+
[cols="1,1,5",options="header"]
161+
|===
162+
| Value | Identifier | Description
163+
| 0x00000001 | | Seen in .store.db and store.db
164+
| | |
165+
| 0x00000004 | | Seen in .store.db and store.db
166+
| 0x00000008 | | Seen in .store.db
167+
| | |
168+
| 0x00000100 | | Seen in .store.db and store.db
169+
| | |
170+
| 0x00000400 | | Seen in .store.db
171+
| 0x00000800 | | Seen in .store.db and store.db
172+
| | |
173+
| 0x00010000 | | [yellow-background]*Unknown (Has database streams map files)?*
174+
|===
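
As an illustration of the header layout described above, the following is a minimal sketch that reads the signature and flags from the first 8 bytes of a store database file. The function name `read_header_flags` and the `KNOWN_FLAG_BITS` tuple are hypothetical; the bit values are the ones listed in the flags table, and their meaning remains largely unknown.

```python
import struct

# Flag bits listed in the file header flags table; most are undocumented.
KNOWN_FLAG_BITS = (0x00000001, 0x00000004, 0x00000008, 0x00000100,
                   0x00000400, 0x00000800, 0x00010000)


def read_header_flags(data):
    """Return (flags, bits listed in the table, remaining unlisted bits)."""
    if len(data) < 8:
        raise ValueError("need at least 8 bytes of file header")
    # Offset 0: 4-byte signature, offset 4: 32-bit little-endian flags.
    signature, flags = struct.unpack_from("<4sI", data, 0)
    if signature != b"8tsd":
        raise ValueError("unsupported signature: %r" % signature)
    listed = [bit for bit in KNOWN_FLAG_BITS if flags & bit]
    unlisted = flags & ~sum(KNOWN_FLAG_BITS)
    return flags, listed, unlisted
```

Reporting the unlisted bits separately makes it easier to spot flag values that have not been observed before.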
175+
116176
=== Pages
117177

118178
....
@@ -161,7 +221,7 @@ The map page value is 16 bytes in size and consists of:
161221
00001050 00 40 00 00 0e d2 01 00 00 00 00 00 51 00 00 00 |.@..........Q...|
162222
....
163223

164-
TODO what about "1mbd"
224+
[yellow-background]*TODO what about "1mbd"*
165225

166226
=== Property table
167227

@@ -201,8 +261,9 @@ Page contains a property table header
201261
Page contains a property table header
202262
| 0x00000081 | | Metadata lists or localized strings +
203263
Page contains a property table header
204-
| 0x00001009 | | data records +
205-
Page contains LZ4 compressed data
264+
3+| _Flags_
265+
| 0x00001000 | | Data is LZ4 compressed
266+
| 0x00004000 | | [yellow-background]*Unknown*
206267
|===
207268

208269
==== Compressed data
@@ -211,22 +272,35 @@ Page contains LZ4 compressed data
211272
|===
212273
| Value | Identifier | Description
213274
| "\x78" | | start of zlib+DEFLATE compressed data
214-
| "bv41" | | start of LZ4 compressed block
275+
| "bv41" | | LZ4 compressed block marker +
276+
See section: <<lz4_compressed_block,LZ4 compressed block>>
277+
| "bv4-" | | LZ4 uncompressed block marker +
278+
See section: <<lz4_uncompressed_block,LZ4 uncompressed block>>
279+
| "bv4$" | | end of LZ4 compressed stream marker
215280
|===
216281

217-
===== [[lz4_compressed_block]]LZ4 compressed block
282+
==== [[lz4_compressed_block]]LZ4 compressed block
218283

219284
[cols="1,1,1,5",options="header"]
220285
|===
221286
| Offset | Size | Value | Description
222287
4+| _LZ4 compressed block header_
223-
| 0 | 4 | "bv41" | Signature
288+
| 0 | 4 | "bv41" | LZ4 compressed block marker
224289
| 4 | 4 | | Uncompressed data size (in bytes)
225-
| 8 | 4 | | Block size (in bytes)
290+
| 8 | 4 | | LZ4 compressed data size (in bytes)
226291
4+| _LZ4 compressed block data_
227292
| 12 | ... | | LZ4 compressed data
228-
4+| _LZ4 compressed block footer_
229-
| ... | 4 | "bv4$" | | LZ4 end of compressed data marker
293+
|===
294+
295+
==== [[lz4_uncompressed_block]]LZ4 uncompressed block
296+
297+
[cols="1,1,1,5",options="header"]
298+
|===
299+
| Offset | Size | Value | Description
300+
| 0 | 4 | "bv4-" | LZ4 uncompressed block marker
301+
| 4 | 4 | | Uncompressed data size (in bytes)
302+
4+| _LZ4 uncompressed block data_
303+
| 8 | ... | | Uncompressed data
230304
|===
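
The block layouts above can be sketched as a small splitter that walks a sequence of blocks without decompressing them. This is an illustration only: it assumes that blocks are stored back-to-back and that the sequence ends with the "bv4$" marker, which this document does not confirm for every case. The function name `split_lz4_blocks` is hypothetical.

```python
import struct


def split_lz4_blocks(data, offset=0):
    """Split a block stream into (marker, uncompressed_size, payload) tuples.

    The payload of a "bv41" block is still LZ4 compressed; decompressing it
    is left to an LZ4 block decoder and is not attempted here.
    """
    blocks = []
    while True:
        marker = data[offset:offset + 4]
        if marker == b"bv4$":  # end of LZ4 compressed stream marker
            return blocks
        if marker == b"bv41":
            # 12-byte header: marker, uncompressed size, compressed size.
            uncompressed_size, payload_size = struct.unpack_from(
                "<II", data, offset + 4)
            offset += 12
        elif marker == b"bv4-":
            # 8-byte header: marker, uncompressed size.
            (uncompressed_size,) = struct.unpack_from("<I", data, offset + 4)
            payload_size = uncompressed_size
            offset += 8
        else:
            raise ValueError(
                "unsupported marker %r at offset %d" % (marker, offset))
        payload = data[offset:offset + payload_size]
        if len(payload) != payload_size:
            raise ValueError("truncated block payload")
        blocks.append((marker, uncompressed_size, payload))
        offset += payload_size
```

Keeping the marker in each tuple lets the caller decide which payloads still need to be passed to a decompressor.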
231305

232306
==== Property table header
@@ -252,7 +326,8 @@ The data record (type 0x09) is variable of size and consists of:
252326
4+| _Record data_
253327
| 4 | ... | | Identifier +
254328
A variable size integer that contains the file system identifier, e.g. CNID on HFS, of the corresponding file (system) entry
255-
| ... | 1 | | Flags
329+
| ... | 1 | | Data record flags +
330+
See section: <<data_record_flags,Data record flags>>
256331
| ... | ... | | Item identifier +
257332
Contains a variable size integer
258333
| ... | ... | | Parent identifier +
@@ -263,22 +338,21 @@ Contains a variable size integer that contains the number of microseconds since
263338
| ... | ... | | Properties array
264339
|===
265340

266-
TODO: describe flags
341+
==== [[data_record_flags]]Data record flags
267342

268343
[cols="1,1,5",options="header"]
269344
|===
270345
| Value | Identifier | Description
271346
| 0x01 | | [yellow-background]*Unknown (Is metadata?)* +
272-
Seen in record with identifier 0 *
273-
Does this influence the behavior of the value type of kMDStoreAccumulatedSizes ?
347+
Seen in record with identifier 0
274348
| 0x02 | |
275349
3+|
276350
| 0x10 | |
277351
| 0x20 | |
278352
| 0x40 | |
279353
|===
280354

281-
TODO: describe property
355+
[yellow-background]*TODO: describe property*
282356

283357
[cols="1,1,1,5",options="header"]
284358
|===
@@ -677,6 +751,109 @@ Data:
677751
| kMDItemWhereFroms | | |
678752
|===
679753

754+
== [[database_streams_map_files]]Database streams map (dbStr-#.map) files
755+
756+
The # in the filename corresponds to the nature of the strings in the map.
757+
758+
[cols="1,1,5",options="header"]
759+
|===
760+
| Value | Identifier | Description
761+
| 1 | | Metadata types streams map
762+
| 2 | | Metadata values streams map
763+
| 3 | | Unknown values 0x41 streams map
764+
| 4 | | Metadata lists streams map
765+
| 5 | | Metadata localized strings streams map
766+
|===
767+
768+
=== Database streams map header file (dbStr-#.map.header)
769+
770+
The database streams map header file (dbStr-#.map.header) is 56 bytes in
771+
size and consists of:
772+
773+
[cols="1,1,1,5",options="header"]
774+
|===
775+
| Offset | Size | Value | Description
776+
| 0 | 8 | "\x00PataD\x00\x00" | Signature
777+
| 8 | 4 | | [yellow-background]*Unknown (Seen: 13)*
778+
| 12 | 4 | | [yellow-background]*Unknown (Seen: 0, 2)*
779+
| 16 | 4 | | [yellow-background]*Unknown (Seen: 1)*
780+
| 20 | 4 | | [yellow-background]*Unknown (size of the corresponding dbStr-#.map.data file?)*
781+
| 24 | 4 | | [yellow-background]*Unknown (page/block size or flags?)*
782+
| 28 | 4 | | [yellow-background]*Unknown (number of entries in the corresponding dbStr-#.map.offsets file?)*
783+
| 32 | 4 | | [yellow-background]*Unknown (similar to value at offset 20)*
784+
| 36 | 4 | | [yellow-background]*Unknown (similar to value at offset 24)*
785+
| 40 | 4 | | [yellow-background]*Unknown (similar to value at offset 28)*
786+
| 44 | 4 | 0 | [yellow-background]*Unknown (empty)*
787+
| 48 | 4 | 0 | [yellow-background]*Unknown (empty)*
788+
| 52 | 4 | 0 | [yellow-background]*Unknown (empty)*
789+
|===
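
A minimal sketch of reading this header follows. Because the meaning of the values is unknown, the hypothetical function `parse_map_header` returns them keyed by their byte offset instead of by name; only the signature, the 56-byte size and the little-endian byte order are taken from this document.

```python
import struct

MAP_HEADER_SIZE = 56
MAP_HEADER_SIGNATURE = b"\x00PataD\x00\x00"


def parse_map_header(data):
    """Return the twelve 32-bit header values keyed by their byte offset."""
    if len(data) < MAP_HEADER_SIZE:
        raise ValueError("map header must be at least 56 bytes")
    if data[:8] != MAP_HEADER_SIGNATURE:
        raise ValueError("unsupported signature: %r" % data[:8])
    # Offsets 8 through 52 hold twelve little-endian 32-bit values.
    values = struct.unpack_from("<12I", data, 8)
    return {8 + 4 * index: value for index, value in enumerate(values)}
```

For example, `parse_map_header(data)[28]` gives the value that may be the number of entries in the corresponding offsets file.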
790+
791+
=== Database streams map offsets file (dbStr-#.map.offsets)
792+
793+
The database streams map offsets file (dbStr-#.map.offsets) is 4096 bytes
794+
in size and consists of:
795+
796+
[cols="1,1,1,5",options="header"]
797+
|===
798+
| Offset | Size | Value | Description
799+
| 0 | 4 x number of entries | | Array of 32-bit offsets +
800+
The offset of the value in the corresponding dbStr-#.map.data file
801+
| ... | ... | 0 | [yellow-background]*Unknown (empty)*
802+
|===
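
The offsets array can be read as sketched below. The number of entries is passed in by the caller, since this document only suspects that it is stored at offset 28 of the header file. The function name `read_map_offsets` is hypothetical.

```python
import struct


def read_map_offsets(data, number_of_entries):
    """Return the first number_of_entries 32-bit little-endian offsets."""
    if number_of_entries < 0 or number_of_entries * 4 > len(data):
        raise ValueError("number of entries does not fit in the offsets data")
    return list(struct.unpack_from("<%dI" % number_of_entries, data, 0))
```

Each returned offset points at a stream value in the corresponding dbStr-#.map.data file.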
803+
804+
=== Database streams map data file (dbStr-#.map.data)
805+
806+
The database streams map data file (dbStr-#.map.data) is of variable size and
807+
consists of:
808+
809+
* One or more stream values
810+
811+
[NOTE]
812+
Note that the first stream value always appears to be a single 0-byte value.
813+
814+
A stream value is of variable size and consists of:
815+
816+
[cols="1,1,1,5",options="header"]
817+
|===
818+
| Offset | Size | Value | Description
819+
| 0 | ... | | Stream value size +
820+
Contains a variable size integer +
821+
See section: <<variable_size_integer,Variable size integer>>
822+
| ... | Stream value size | | Stream value
823+
|===
824+
825+
==== Metadata attribute types
826+
827+
The dbStr-1.map.data file contains metadata attribute types that consist of:
828+
829+
[cols="1,1,1,5",options="header"]
830+
|===
831+
| Offset | Size | Value | Description
832+
| 0 | 1 | | Value type +
833+
See section: <<metadata_attribute_value_types,Metadata attribute value types>>
834+
| 1 | 1 | | Property type
835+
| 2 | ... | | Key name +
836+
Contains a UTF-8 encoded string with an end-of-string character
837+
|===
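
The layout above can be illustrated with a short sketch that splits a single stream value, with its size prefix already removed, into its three parts. The function name `parse_attribute_type` is hypothetical, and the sketch assumes the end-of-string character is a single 0-byte.

```python
def parse_attribute_type(stream_value):
    """Return (value_type, property_type, key_name) from a stream value."""
    if len(stream_value) < 3:
        raise ValueError("stream value too small for an attribute type")
    value_type = stream_value[0]
    property_type = stream_value[1]
    # The key name starts at offset 2 and runs up to the end-of-string byte.
    end_offset = stream_value.find(b"\x00", 2)
    if end_offset == -1:
        raise ValueError("missing end-of-string character")
    key_name = stream_value[2:end_offset].decode("utf-8")
    return value_type, property_type, key_name
```

The value type and property type are returned as plain integers so that they can be looked up in the tables of this document.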
838+
839+
==== Metadata attribute values
840+
841+
The dbStr-2.map.data file contains metadata attribute values that consist of:
842+
843+
[cols="1,1,1,5",options="header"]
844+
|===
845+
| Offset | Size | Value | Description
846+
| 0 | ... | | Metadata attribute value name +
847+
Contains a UTF-8 encoded string with an end-of-string character
848+
|===
849+
850+
=== Database streams map buckets file (dbStr-#.map.buckets)
851+
852+
The database streams map buckets file (dbStr-#.map.buckets) is 4096 bytes
853+
in size and consists of:
854+
855+
[yellow-background]*TODO: describe*
856+
680857
== Notes
681858

682859
....
