Reverse Engineering SSNS Format

Jean-Rémy Bancel edited this page Mar 28, 2018 · 1 revision

Why?

At least the following files use this format:

  • Current Session
  • Current Tabs
  • Last Session
  • Last Tabs

These files store everything related to tabs and session state, which makes them a very rich source of information for forensic investigations. It seems that no open source software is available to parse this format, so I started working on it.

Architecture

An SNSS file is a Header followed by a list of SessionCommand records.

Header

It is defined in chrome/browser/sessions/session_backend.cc.

struct FileHeader {
    int32 signature;
    int32 version;
};

signature is a magic number used to identify the format.

static const int32 kFileSignature = 0x53534E53;

version is the version number of the format used for compatibility reasons.

static const int32 kFileCurrentVersion = 1;
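
As an illustration, the header check can be sketched in Python as follows. The function name parse_header is made up for this example, and little-endian byte order is an assumption (it is what makes the file start with the ASCII bytes "SNSS").

```python
import struct

SNSS_SIGNATURE = 0x53534E53   # kFileSignature
SNSS_VERSION = 1              # kFileCurrentVersion
HEADER = struct.Struct("<ii")  # two int32 values, little-endian (assumed)


def parse_header(data):
    """Return (signature, version) read from the first 8 bytes of data."""
    if len(data) < HEADER.size:
        raise ValueError("file too short for an SNSS header")
    signature, version = HEADER.unpack_from(data, 0)
    if signature != SNSS_SIGNATURE:
        raise ValueError("bad signature: 0x%08X" % signature)
    return signature, version


# Synthetic 8-byte header, built here only to exercise the function.
sample = HEADER.pack(SNSS_SIGNATURE, SNSS_VERSION)
print(parse_header(sample))
```

A stricter parser might also reject versions other than 1. This sketch only reports the version.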

Session Command

It is described in chrome/browser/sessions/session_command.h.

// SessionCommand contains a command id and arbitrary chunk of data. The id 
// and chunk of data are specific to the service creating them. 
+------------+--------------+
| Command Id |    Content   |
+------------+--------------+
 <= 8 bits => <= Variable =>
  • Command Id is uint8
  • Content can be raw data (a memory copy of a C structure) or a Pickle (serialized data)

Storage

When the file is read, the size of each SessionCommand is needed. That is why the size of the command is written on 16 bits (uint16) just before the command itself.

+--------------+------------+--------------+
| Command Size | Command Id |    Content   |
+--------------+------------+--------------+
 <= 16 bits  => <= 8 bits => <= Variable =>

Given this information, it is trivial to extract a list of commands from an SNSS file. The hardest part is to give a meaning to these commands.
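
The extraction loop might look like the following Python 3 sketch. It rests on two assumptions that are not stated above: values are little-endian, and Command Size counts the Command Id byte plus the Content. The name iter_commands and the command ids in the sample are made up.

```python
import struct


def iter_commands(data, offset=8):
    """Yield (command_id, content) pairs, starting right after the 8-byte header."""
    while offset + 2 <= len(data):
        (size,) = struct.unpack_from("<H", data, offset)  # uint16 Command Size
        offset += 2
        if size == 0 or offset + size > len(data):
            break  # truncated or corrupt record: stop reading
        command_id = data[offset]                 # uint8 Command Id
        content = data[offset + 1:offset + size]  # remaining size - 1 bytes
        yield command_id, content
        offset += size


# Synthetic file: header, then two commands with arbitrary ids (6 and 2).
sample = (
    struct.pack("<ii", 0x53534E53, 1)
    + struct.pack("<HB", 3, 6) + b"\x01\x02"
    + struct.pack("<HB", 1, 2)
)
for command_id, content in iter_commands(sample):
    print(command_id, content.hex())
```

If Command Size turned out to exclude the id byte, only the two slice bounds would need to change.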

Parsing SessionCommand

As stated in its description, SessionCommand is a general-purpose structure: its content can be anything. Nevertheless, there are two main categories of content:

  • The content mapped from a C structure
  • A Pickle (serialized object)

Content mapped from a C structure

You need to identify the structure in the code associated with the Command Id. For example, a CommandTabClosed uses the ClosedPayload structure, defined as follows in chrome/browser/sessions/session_service.cc:

struct ClosedPayload {
   SessionID::id_type id;
   int64 close_time;
};
struct IDAndIndexPayload {
   SessionID::id_type id;
   int32 index;
};

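SessionID::id_type is defined in chrome/browser/sessions/session_id.h:

typedef int32 id_type;

It should not be confused with SessionCommand::id_type, the uint8 type of the Command Id, which is defined in chrome/browser/sessions/session_command.h.

There are plenty of such structures defined in chrome/browser/sessions/session_service.cc.

Alignment

Because SessionID::id_type is 32 bits wide, a structure like IDAndIndexPayload takes 8 bytes. Memory alignment also has to be taken into account: the compiler pads a structure so that its size is a multiple of the alignment of its largest member.

This explains why the following two payloads are both mapped on 8 bytes: the array holds two 32-bit ids, and PinnedStatePayload holds a 4-byte id, a 1-byte bool and 3 bytes of padding.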

SessionID::id_type payload[] = { window_id.id(), tab_id.id() };
struct PinnedStatePayload {
   SessionID::id_type tab_id;
   bool pinned_state;
};
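
The sizes can be checked from Python with ctypes, which applies the same padding rules as the C compiler on the host. This is a minimal sketch that assumes SessionID::id_type is a 32-bit integer and that the payloads are little-endian. The sample bytes are synthetic.

```python
import ctypes
import struct


class IDAndIndexPayload(ctypes.LittleEndianStructure):
    # Assumed layout: SessionID::id_type as int32, then int32 index.
    _fields_ = [("id", ctypes.c_int32), ("index", ctypes.c_int32)]


class PinnedStatePayload(ctypes.LittleEndianStructure):
    # 4-byte id + 1-byte bool; ctypes adds 3 bytes of tail padding.
    _fields_ = [("tab_id", ctypes.c_int32), ("pinned_state", ctypes.c_bool)]


print(ctypes.sizeof(IDAndIndexPayload))
print(ctypes.sizeof(PinnedStatePayload))

# Decode a synthetic 8-byte PinnedStatePayload (id 206, pinned, 3 pad bytes).
raw = struct.pack("<i?3x", 206, True)
payload = PinnedStatePayload.from_buffer_copy(raw)
print(payload.tab_id, payload.pinned_state)
```

The same bytes can also be decoded without ctypes by spelling out the padding in the format string: struct.unpack("<i?3x", raw).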

Content as a Pickle

Everything needed to understand how Pickle objects are designed is in base/pickle.*.

Header

struct Header {
    uint32 payload_size;
};

The payload size is stored in the header so that the header size can vary. The payload offset in the raw data is pickle_size - payload_size.
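
The offset computation can be sketched as follows. split_pickle is a hypothetical helper name, and little-endian byte order is an assumption.

```python
import struct


def split_pickle(raw):
    """Split raw Pickle bytes into (header_bytes, payload_bytes)."""
    (payload_size,) = struct.unpack_from("<I", raw, 0)  # uint32 payload_size
    payload_start = len(raw) - payload_size             # pickle_size - payload_size
    if payload_start < 4:
        # The header must at least contain payload_size itself.
        raise ValueError("payload_size is larger than the Pickle allows")
    return raw[:payload_start], raw[payload_start:]


# Synthetic Pickle: minimal 4-byte header followed by an 8-byte payload.
raw = struct.pack("<I", 8) + b"\x00" * 8
header, payload = split_pickle(raw)
print(len(header), len(payload))
```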

Data

Every basic type can be written in a Pickle: Boolean, Int, String, etc. Simple types are written directly, and Strings are preceded by their size. A Pickle storing an Int, a String and a Boolean looks like this:

+-----+-------------+--------+------+
| Int | String Size | String | Bool |
+-----+-------------+--------+------+

Every value smaller than 32 bits is written on 32 bits. This has to be taken into account when reading (a uint8 or a uint16 occupies 4 bytes, for example).
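
A reader for this layout might look like the sketch below. The class and method names are made up. The rounding of string data up to a multiple of 4 bytes is an assumption that extends the 32-bit rule above to strings, and little-endian byte order is assumed too.

```python
import struct


class PickleReader:
    """Sequential reader over a Pickle payload (the bytes after the header)."""

    def __init__(self, payload):
        self.payload = payload
        self.pos = 0

    def read_int(self):
        (value,) = struct.unpack_from("<i", self.payload, self.pos)
        self.pos += 4
        return value

    def read_bool(self):
        # A Boolean occupies a full 32-bit slot.
        return self.read_int() != 0

    def read_string(self):
        length = self.read_int()  # String Size, in bytes
        if length < 0 or self.pos + length > len(self.payload):
            raise ValueError("string size %d does not fit in payload" % length)
        value = self.payload[self.pos:self.pos + length]
        self.pos += (length + 3) & ~3  # assumption: padded to a 4-byte boundary
        return value


# Synthetic payload: Int 42, String "hello" (5 bytes + 3 pad bytes), Bool true.
payload = struct.pack("<ii", 42, 5) + b"hello" + b"\x00" * 3 + struct.pack("<i", 1)
reader = PickleReader(payload)
print(reader.read_int(), reader.read_string(), reader.read_bool())
```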

CommandUpdateTabNavigation

This command uses a Pickle. I have implemented a basic Pickle parser. It half works: I can correctly read two integers and a string. Then comes a String16, a string whose characters are stored on 16 bits. I am not sure whether the encoding is UTF-16, but decoding it as UTF-16 in Python works most of the time.

The current problem is that the number giving the size of the string is sometimes obviously wrong: for example, it is bigger than the size of the Pickle payload itself. I don't know where this comes from.

For the moment I am able to retrieve:

  • Tab ID
  • Index
  • Url in the tab
  • Sometimes the title

Here is an example from my Current Session file.

dataSize: 1258, payloadSize: 1254, payloadStart: 4
Tab Id: 150
Index: 3
Url: http://www.freebsd.org/cgi/cvsweb.cgi/ports/www/py-mechanize/
Title: None
----------------------------
dataSize: 6513, payloadSize: 6509, payloadStart: 4
Tab Id: 206
Index: 2
Url: http://code.activestate.com/recipes/410662-a-function-to-check-if-a-number-is-prime/
Title: A function to check if a number is prime « Python recipes « ActiveState Code
----------------------------
dataSize: 6604, payloadSize: 6600, payloadStart: 4
Tab Id: 206
Index: 3
Url: http://www.programme-tv.net/
Title: Votre programme TV avec Télé Loisirs : le programme télévision grandes chaînes, TNT et câble
----------------------------
dataSize: 11528, payloadSize: 11524, payloadStart: 4
Tab Id: 206
Index: 4
Url: http://www.programme-tv.net/programme/programme-tnt.html
Title: Programme TNT : les 18 chaînes du programme TV TNT

My current goal is to understand why the size of the second string is wrong.
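
One possible explanation, offered here as an assumption and not as a confirmed finding, is that each variable-length field is padded to a 4-byte boundary, so the size of the String16 has to be read at the next multiple of 4 after the URL. The sketch below also assumes that the String16 size counts 16-bit characters, not bytes, and that values are little-endian. parse_tab_navigation and the sample data are made up.

```python
import struct


def _align4(n):
    """Round n up to the next multiple of 4."""
    return (n + 3) & ~3


def parse_tab_navigation(payload):
    """Read Tab ID, Index, URL and title from a Pickle payload (header removed)."""
    tab_id, index, url_len = struct.unpack_from("<iii", payload, 0)
    pos = 12
    if url_len < 0 or pos + url_len > len(payload):
        raise ValueError("URL size %d does not fit in payload" % url_len)
    url = payload[pos:pos + url_len].decode("utf-8", "replace")
    pos += _align4(url_len)  # assumption: skip the padding after the URL

    title = None
    if pos + 4 <= len(payload):
        (title_len,) = struct.unpack_from("<i", payload, pos)
        pos += 4
        title_bytes = 2 * title_len  # assumption: size is in 16-bit characters
        if title_len >= 0 and pos + title_bytes <= len(payload):
            title = payload[pos:pos + title_bytes].decode("utf-16-le", "replace")
    return {"tab_id": tab_id, "index": index, "url": url, "title": title}


# Synthetic payload: the 19-byte URL needs 1 pad byte, the 14-byte title needs 2.
url = b"http://example.org/"
title = "Example".encode("utf-16-le")
sample = (
    struct.pack("<iii", 206, 3, len(url)) + url + b"\x00"
    + struct.pack("<i", 7) + title + b"\x00" * 2
)
print(parse_tab_navigation(sample))
```

In this synthetic sample, reading the title size directly after the 19 URL bytes, without skipping the pad byte, gives 1792 where 7 is expected. That is the same kind of oversized value as the one described above.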

Test

You can test the current state of the project in the SNSS branch:

python chromagnonSession.py ~/.config/chromium/Default/Current\ Session

python chromagnonTab.py ~/.config/chromium/Default/Current\ Tabs