forked from iloreen/libsumo
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathREADME.dev
161 lines (118 loc) · 6.23 KB
/
README.dev
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
Protocol
--------
My understanding so far: the protocol to control the Jumping Sumo is
mainly based on UDP-messages. In the beginning there is a small
configuration exchange using TCP (and JSON). This connection is might
also be also used for configuring the Sumo.
DNS-SD/MDNS - discovery
-----------------------
The very first thing. The Freeflight 3 app makes the device join a
multicast group and does a MDNS discover to find the Sumo IP address
and entry service: It looks for who is providing the '_arsdk-0902' on
a '._udp.local'-domain .
Sometimes it gets a response, sometimes not. This "sometimes" is, I
think, the root-problem of the FreeFlight3 App of sometimes not
finding the Sumo on some Phones/Tablets.
The response contains the three DNS-types of information: 'TXT',
'SRV', 'A'. TXT gives us the complete IP-name of the device:
'JumpingSumo-<serial-no>._arsdk-0902._udp.local'. 'SRV' tells us
that there is a service "JumpingSumo-<serial-no>" at port 44444,
targetting "JumpingSumo-<serial-no>.local" . 'A' provides us with
IP-Address for "JumpingSumo-<serial-no>.local"
TCP connection to 44444
-----------------------
With this information the controller establishes a TCP-connection to
"JumpingSumo-<serial-no>.local" on port 44444 doing exactly one
exchange:
We send JSON-formatted string:
{ "controller_name" : "<device name>", "controller_type": "<control name>", "d2c_port": 54321 }
Here the important part is the 'd2c_port'-field which tells the device (Sumo)
to which UDP-port of the controller (PC or SmartPhone) it has to
send the network-packets.
The device answers with a JSON-formatted string:
{ "status": 0, "c2d_port": 54321, "arstream_fragment_size": 65000, "arstream_fragment_maximum_number": 4,
"c2d_update_port": 51, "c2d_user_port": 21 }
Basically we see here the 'c2d_port' which tells us where to send our
outgoing UDP-packets. The other fields are yet to be exploited.
Touching user_port or update_port seems undarable if you do not want
to break the device.
Once this is done the TCP-connection is closed and from now on only
UDP-packets are sent.
UDP-packets
-----------
For details about the packet structures see lib/protocol.h. Some of
the logic can only be found in the lib/*.cpp files which are using
these structures.
In the packet-structures some of the fields are called 'unk' or
'unknown'. This indicates that I wasn't able to figure out their
usage. Basically they have been constant in all analyzed communication.
I names the 4 main packet-types: SYNC, ACK, IOCTL and IMAGE.
SYNC is bi-directional and we see timestamps getting in, being
confirmed from the controller and vice-versa. Additionally the
controller senda the move and turn-command using this type of packet
in regular manner.
IMAGE is an incoming-only packet and carries the latest camera image
of the Sumo in JPEG-format and a total frame_number-index.
Images are received in a VGA resolution, so what looks like a video is
in fact a sequence of JPEGs. At first I was stunned as to why to do it
like this considering the bandwidth used. But think of the possibility
of lost UDP-packets: you cannot work with any advanced video-codec as
losing one frame can kill the stream for several seconds. Whereas
losing one JPEG might even go unnoticed by the user.
The frame-rate is about 15 fps.
IOCTL is used to signal several control commands, including basic
information gathered from the device, enabling of video-streaming,
special moves and jumps and so on. It is again bi-directional and the
device is sending for example the battery-level using this
packet-type.
ACK is used to acknowledge the IOCTL command sequence number. Done by
the device and by the controller. If the ACK-packet is not send within
100ms the device will send it again up to 3 times.
LibSumo
-------
The code which can be found in lib/ is the implemention of a library
intended to be run on the controller-side.
It is written in C++11 and uses some of the new feature introduced in
C++, like threads and mutexes.
The following block-diagram descibes the structure and the role of
each thread in detail:
Everything the user needs to create is an object of the class
Control. Calling the method open() will establish a connection to
the Sumo and initialize it until it is ready to be used. close() will
release the connection.
Calling open() will also create a bunch of threads inside the
controller object. Each thread has a distinctive role regarding the
communication with the Sumo:
There is the Dispatcher In-, Control In-, ReatTime In- and Out- and the
Image-thread, and of course the main or user-thread (Controller Out).
The threads are communicating with each other using message-queues.
The Control OUT-thread is the one sending IOCTL-messages to the
device expressing action which have been requested by an user action.
The Dispatcher IN-thread reads data from the UDP-inbound
connection. After parsing the message-header it will dispatch the
message into the message-queue for the appropriate thread.
These two threads are implemented in the Control-class.
Control IN is implemented next to the Control-class in a private class
manner. It handles all incoming IOCTL-messages such as battery-level
or other information send by the Sumo (SerialNumber) and provides them
via the API to the user. It signals as well certain states regarding
incoming IOCTL-message so that the protocol-logic is kept synchronous.
The RealTime IN and OUT thread are handling the in- and outgoing
nanosecond-based heartbeat timestamp send every "now and then" by the
controller and the device. In addition the outgoing traffic of this
type is used to send move and turn instructions. I haven't yet figured
out the real reason for the precise heartbeat action. I have seen
videos provided by Parrot where one can observe multiple Sumos doing
synchronized actions. Maybe it is used for that.
One thread is dedicated to the image-reception.
API
---
Have a look into control.h. The public methods are the action I was
able to figure out. I even found actions which where, at the time of
writing this, not available in FreeFlight3.
Debug and decode
----------------
In the library there is also a collection of tool I used during
reverse-engineering. Especially the decode.cpp-tools can be used to
analyze a network stream and also to debug what you are actually
sending.