Exploring The Halo 1 System Link Protocol
September 18, 2023
One of my all-time favorite games is Halo: Combat Evolved that launched together with the original Xbox in 2001. One major part of the game was multiplayer, up to 16 people at a LAN party could play together by connecting four Xbox consoles over Ethernet, referred to as System Link.
Out of pure curiosity, we will try to look at how the Halo game instances communicate with each other over System Link. In order to do that, we will set up System Link, play some Halo multiplayer, and see if we can reverse engineer the network protocol by inspecting the messages that are sent between the consoles. Should be easy enough, right?
Table of contents
System Link setup
I still have a fully functional unmodified original Xbox console. We will connect it to a computer using Ethernet cables via a network switch and run an Xbox emulator on the computer that will play together with the console.
We will use the xemu emulator, which is based on the excellent QEMU project. The Xbox contains a more or less off-the-shelf Intel Pentium III, a 32-bit x86 CPU. If interested, you may find a good introduction to the Xbox architecture here.
In order to run Halo with xemu, we will need some files, in particular:
- a 6.9 GiB game disc, containing the game executable (XBE) and all assets,
- an HDD image in QEMU’s qcow2 format, containing a dashboard that launches the game,
- a 256 KiB BIOS ROM (actually 1024 KiB, replicated 4 times), containing the BIOS, bootloader and the Xbox kernel,
- a 512 B MCPX ROM which does some configurations at boot and then decrypts and runs the bootloader,
- a 256 B EEPROM, with some device-specific information, e.g. the MAC address.
The xemu project provides pre-formatted HDD images that contain a dummy dashboard that we will use. Xemu will also automatically set up an EEPROM for us. For the MCPX ROM, we will use an open source reimplementation. For the game disc and BIOS ROM, we will have to use dumped images.
We will use PAL region copies of the game, which differ from the ones in NTSC regions. For example, the original NTSC version runs at 30 frames per second while the PAL version runs at 25 frames per second1.
Now we boot up Halo on the Xbox console, create a System Link game and connect the console to our computer with some Ethernet cables. We then boot up the emulator, set up the above mentioned files, make sure it is set up to use the Ethernet network interface and then search for System Link games.
And voila! We find the game hosted by the console and can play some Halo over LAN.
Observing the traffic
While searching for games, packets should be sent between the instances
continuously, because as soon as the game is closed it is removed from the
list. We can then run e.g. tcpdump
and see if we can detect any of these
packets over our network interface.
$ tcpdump
listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes
14:42:58.791069 IP 0.0.0.1.xbox > 255.255.255.255.xbox: UDP, length 60
14:42:58.801698 IP 0.0.0.1.xbox > 255.255.255.255.xbox: UDP, length 324
14:42:58.919744 IP erlite.domain > igelkott.49337: 31319* 1/0/0 PTR igelkott. (65)
14:43:00.531704 ARP, Request who-has narwhal tell igelkott, length 28
14:43:00.808959 IP 0.0.0.1.xbox > 255.255.255.255.xbox: UDP, length 60
14:43:00.810704 IP 0.0.0.1.xbox > 255.255.255.255.xbox: UDP, length 324
Besides some random traffic on my LAN, we can see that there are some
UDP packets that go via a port that tcpdump
has actually labeled as
“xbox”2. We can use the BPF expression udp port xbox
to
filter by this UDP port, and ignore all other traffic. In order to normalize
the timestamps to be offset from the first packet, we will add the -ttttt
option.
Interestingly, the Xbox consoles always use 0.0.0.1
as the source IP
address and they send broadcast messages to 255.255.255.255
. They
seem to address each other by MAC addresses only, we therefore add the -e
option to display the MAC addresses in order to know which of the two machines
each packet is sent from.
In order to improve readability, we will strip some redundant information and replace the MAC addresses of the machines by the names of the consoles used by the game. My console happens to be called “Howard” and is the host of the game while the emulator is called “Ghost” which is a guest that connects to the host’s game. Our output should then be easier to read:
00.000000 Ghost > ff:ff:ff:ff:ff:ff, len 60
00.015546 Howard > ff:ff:ff:ff:ff:ff, len 324
02.001024 Ghost > ff:ff:ff:ff:ff:ff, len 60
02.024528 Howard > ff:ff:ff:ff:ff:ff, len 324
04.040860 Ghost > ff:ff:ff:ff:ff:ff, len 60
04.074594 Howard > ff:ff:ff:ff:ff:ff, len 324
Now we can see that the guest searches for existing games with a 60-byte packet every 2 seconds, to which the game host quickly responds with a 324-byte packet. If we then join the game, we initially get a rapid series of exchanged packets:
00.000000 Ghost > Howard, len 164
00.013472 Howard > Ghost, len 164
00.023361 Ghost > Howard, len 48
00.023476 Howard > Ghost, len 48
00.024722 Ghost > Howard, len 44
00.025583 Ghost > Howard, len 140
00.096030 Howard > Ghost, len 60
00.096288 Howard > Ghost, len 1132
00.121319 Ghost > Howard, len 124
00.135767 Howard > Ghost, len 44
00.137273 Howard > Ghost, len 1132
00.155669 Ghost > Howard, len 44
00.160945 Ghost > Howard, len 92
00.161371 Ghost > Howard, len 316
00.178269 Howard > Ghost, len 1132
00.200977 Ghost > Howard, len 92
00.336793 Howard > Ghost, len 44
00.341981 Howard > Ghost, len 60
While we are connected to the pre-game lobby (and not pressing anything), a 316-byte packet with a 44-byte response is sent by the guest every second. The host also sends a 60-byte packet with a 44-byte response every 5 seconds:
00.000000 Ghost > Howard, len 316
00.064942 Howard > Ghost, len 44
00.426135 Howard > Ghost, len 60
00.599351 Ghost > Howard, len 44
01.040111 Ghost > Howard, len 316
01.068941 Howard > Ghost, len 44
02.080000 Ghost > Howard, len 316
02.274953 Howard > Ghost, len 44
03.119122 Ghost > Howard, len 316
03.278949 Howard > Ghost, len 44
04.159019 Ghost > Howard, len 316
04.283958 Howard > Ghost, len 44
05.199096 Ghost > Howard, len 316
05.287955 Howard > Ghost, len 44
05.428169 Howard > Ghost, len 60
When starting the game we get an initial exchange of packets:
00.000000 Howard > Ghost, len 60
00.022185 Ghost > Howard, len 44
00.263800 Ghost > Howard, len 316
00.265740 Howard > Ghost, len 44
00.655155 Howard > Ghost, len 1132
00.655155 Howard > Ghost, len 60
00.669542 Ghost > Howard, len 60
00.868764 Howard > Ghost, len 44
01.067857 Howard > Ghost, len 132
01.108929 Howard > Ghost, len 132
01.144205 Ghost > Howard, len 52
01.149912 Howard > Ghost, len 132
01.185236 Ghost > Howard, len 84
01.225189 Ghost > Howard, len 44
01.225248 Ghost > Howard, len 84 (x 13)
01.759222 Ghost > Howard, len 84
01.785473 Howard > Ghost, len 132 (x 14)
01.786040 Howard > Ghost, len 132
After that we get repeating cycles of packets again:
00.000000 Ghost > Howard, len 84
00.001678 Howard > Ghost, len 132
00.009963 Ghost > Howard, len 44
00.041000 Ghost > Howard, len 84
00.042677 Howard > Ghost, len 132
00.082002 Ghost > Howard, len 84
00.083680 Howard > Ghost, len 132
00.123003 Ghost > Howard, len 84
00.124683 Howard > Ghost, len 132
00.163999 Ghost > Howard, len 84
00.165695 Howard > Ghost, len 132
00.205000 Ghost > Howard, len 84
00.206679 Howard > Ghost, len 132
00.210963 Ghost > Howard, len 44
00.246005 Ghost > Howard, len 84
00.247677 Howard > Ghost, len 132
00.247715 Howard > Ghost, len 132
The most frequent packet is 84 bytes, it is sent by the guest every 41 ms and receives a 132-byte response. This is close to the 25 frames per second that PAL Halo runs at. Sometimes, another 132-byte packet is sent by the host immediately after the first response. Every 200 ms a 44-byte packet is also sent by the guest with no response.
So, some form of game data is sent between the guest and the host every frame of the game. Now we just need to inspect those packets and see if we can understand what information they contain, right?
We can supply the -X
flag to tcpdump
in order to show the packet data in
hexadecimal and ASCII. At first we listen during the game search as those
packets should be mostly static and perhaps contain the name of the host in
plain text. Let us try it out:
$ tcpdump -e -ttttt -X "udp port xbox"
00.000000 Ghost > ff:ff:ff:ff:ff:ff, len 60
0x0000: 4500 0058 13d1 0000 4011 66c4 0000 0001 E..X....@.f.....
0x0010: ffff ffff 0c02 0c02 0044 2ed5 ffff ffff .........D......
0x0020: ffff ffff 8246 0228 a44a 9087 dada fddd .....F.(.J......
0x0030: 8513 eeab 4048 fe18 d09a ec97 d161 ea37 ....@H.......a.7
0x0040: 7459 d31a 8633 ecc0 2cbc f43f 1635 4c17 tY...3..,..?.5L.
0x0050: 1466 f586 e72a ccdc .f...*..
00.005401 Howard > ff:ff:ff:ff:ff:ff, len 324
0x0000: 4500 0160 c9fb 0000 4011 af91 0000 0001 E..`....@.......
0x0010: ffff ffff 0c02 0c02 014c 888f ffff ffff .........L......
0x0020: ffff ffff a64c 9b44 9a6e 8a15 537d 10e2 .....L.D.n..S}..
0x0030: ccd0 8f84 8d12 f572 41c1 4d05 db1e b765 .......rA.M....e
0x0040: 8f47 fb39 6f32 c357 7ea6 a551 2c21 f632 .G.9o2.W~..Q,!.2
:
0x0150: 18a7 0f7c f970 b787 b371 6c60 1260 5825 ...|.p...ql`.`X%
02.040201 Ghost > ff:ff:ff:ff:ff:ff, len 60
0x0000: 4500 0058 14d1 0000 4011 65c4 0000 0001 E..X....@.e.....
0x0010: ffff ffff 0c02 0c02 0044 8e15 ffff ffff .........D......
0x0020: ffff ffff 64eb c15c 4dd1 58da 4160 2309 ....d..\M.X.A`#.
0x0030: c9a9 0425 87ef dcd7 48a0 cf1e 826f 71de ...%....H....oq.
0x0040: d626 8560 ec82 4199 cb8f 7ae3 fe4e 8a4d .&.`..A...z..N.M
0x0050: 5f8a 80c1 2fa9 80a3 _.../...
02.055399 Howard > ff:ff:ff:ff:ff:ff, len 324
0x0000: 4500 0160 cefb 0000 4011 aa91 0000 0001 E..`....@.......
0x0010: ffff ffff 0c02 0c02 014c 7062 ffff ffff .........Lpb....
0x0020: ffff ffff 7e37 59fa e1f3 0c03 618f f89c ....~7Y.....a...
0x0030: 9fb7 8a3d 7edf 4d1c c042 cb39 5c0a c8e0 ...=~.M..B.9\...
0x0040: e22d d069 1ed6 4cfc 555f 6412 ce2b e2f1 .-.i..L.U_d..+..
:
0x0150: b0df bc69 a6cc 6e82 99f9 e032 82cb 19e0 ...i..n....2....
Unfortunately, these packets do not contain any readable strings, nor is the content static between packets. The data shown by tcpdump shows the whole packet except for the Ethernet header, we can annotate one of the packets as per the IP and UDP RFCs:
/* IPv4 header */
00: 45 /* version = ipv4, header length = 5*4 = 0x14 bytes */
01: 00 /* type of service = routine */
02: 00 58 /* IP total length */
04: 13 d1 /* identification */
06: 00 00 /* flags, fragment offset */
08: 40 /* time to live */
09: 11 /* IP protocol = UDP */
0a: 66 c4 /* IP header checksum */
0c: 00 00 00 01 /* IP source address */
10: ff ff ff ff /* IP destination address */
/* UDP header */
14,00: 0c 02 /* UDP source port = 3074 */
16,02: 0c 02 /* UDP destination port = 3074 */
18,04: 00 44 /* UDP total length */
1a,06: 2e d5 /* checksum */
/* data */
1c,08,00: ff ff ff ff ff ff ff ff 82 46 02 28 a4 4a 90 87
2c,18,10: da da fd dd 85 13 ee ab 40 48 fe 18 d0 9a ec 97
3c,28,20: d1 61 ea 37 74 59 d3 1a 86 33 ec c0 2c bc f4 3f
4c,38,30: 16 35 4c 17 14 66 f5 86 e7 2a cc dc
The actual data always starts with eight 0xff
bytes for these packets, the
remaining looks completely random, presumably encrypted. How come the packets
are encrypted? Is the System Link traffic over LAN really that critical? Oh
well, now I am even more curious to see what it contains.
Watching from inside
Even though we are not able to read the contents of the packets from outside, the emulator must be able to read it as it has clearly managed to connect to and play with the Xbox console. If we were able to look inside the emulator’s memory, we should be able to inspect the decrypted packet data.
Like with QEMU, we can with xemu easily attach a debugger to the guest program.
We simply provide -gdb tcp::1234
when starting xemu to create a GDB
server that listens on the TCP port 1234. We can then connect to it by running
(gdb) target remote :1234
in a GDB console.
However, it is not entirely trivial to find the location of a message in memory just as it has been decrypted. But if we were to know where the decryption occurs in the program, we could place a breakpoint there and read the memory for the message right after it has been decrypted (or before it has been encrypted).
I suspect that the cryptography stack is not specific to the Halo game, probably all Xbox multiplayer games use similar type of encryption. It would then seem plausible that the crypto code is located in a shared place, such as for example the Xbox kernel.
The XboxDevWiki has a useful wiki page about the kernel, it has a list of functions and variables that it exports. Some of these exports contain nice words like “crypt” and “key”. There is a group of exports prefixed with “Xc” that are related to cryptography and some sort of LAN key:
export | ordinal | type |
---|---|---|
XcSHAInit | 335 | stdcall |
XcSHAUpdate | 336 | stdcall |
XcSHAFinal | 337 | stdcall |
XcRC4Key | 338 | stdcall |
XcRC4Crypt | 339 | stdcall |
XcHMAC | 340 | stdcall |
XcModExp | 345 | stdcall |
XcDESKeyParity | 346 | stdcall |
XcKeyTable | 347 | stdcall |
XcBlockCrypt | 348 | stdcall |
XcBlockCryptCBC | 349 | stdcall |
XboxLANKey | 353 | variable |
Searching for these on the web leads to a header file in the
nxdk project, an open source SDK for the Xbox. It lists the prototype
declarations of all exported kernel functions, including the XcBlockCryptCBC
function which looks promising:
XBAPI VOID NTAPI XcBlockCryptCBC
(
IN ULONG dwCipher,
IN ULONG dwInputLength,
OUT PUCHAR pbOutput,
IN PUCHAR pbInput,
IN PUCHAR pbKeyTable,
IN ULONG dwOp,
IN PUCHAR pbFeedback
);
As mentioned by the table of exports, the functions use the stdcall
calling convention where all arguments are pushed to the stack in
right-to-left order. If we can place a breakpoint when entering the
XcBlockCryptCBC
function, we can read the input and output arguments from the
stack in order to inspect both the encrypted and decrypted messages.
Unfortunately we do not have a symbol table so we do not know where this
function is located in the program memory. Actually, we do not even have the
executable, the kernel is compressed and encrypted within the BIOS ROM.
Again, the emulator must be able to read it. The kernel wiki page
mentions that the kernel base address is 0x8001_0000
. It might then be ready
and loaded at that address already. If so, we can dump that region to a file
using GDB. We need an end address so we will go ahead with, say 1 MiB from the
base address:
(gdb) dump binary memory xboxkrnl.exe 0x80010000 0x80110000
Cannot access memory at address 0x8003d000
Oh, okay, I guess that is where it ends then, making it 180 KiB3.
(gdb) dump binary memory xboxkrnl.exe 0x80010000 0x8003d000
We can now try to use e.g. objdump
to see if it can recognize the executable
type and is able to find the exports:
$ objdump -x xboxkrnl.exe
xboxkrnl.exe: file format coff-i386
:
Magic 010b (PE32)
:
Export Table:
DLL name: xboxkrnl.exe
Ordinal base: 1
Ordinal RVA Name
1 0x21281
:
335 0x20bcd /* XcSHAInit */
336 0x20bd3 /* XcSHAUpdate */
337 0x20bd9 /* XcSHAFinal */
338 0x20bdf /* XcRC4Key */
339 0x20be5 /* XcRC4Crypt */
340 0x20beb /* XcHMAC */
:
345 0x20c25 /* XcModExp */
346 0x20c41 /* XcDESKeyParity */
347 0x20c47 /* XcKeyTable */
348 0x20c4d /* XcBlockCrypt */
349 0x20c69 /* XcBlockCryptCBC */
:
353 0x2bf08 /* XboxLANKey */
:
366 0x6193
It can indeed! The file format is detected as PE32, which is an executable
format also used by Windows. The kernel has 364 exports, from ordinal 1 to 366
but with 7 and 11 missing. The XcBlockCryptCBC
function has ordinal 349 so it
should be located at 0x8001_0000
+ 0x0002_0c69
= 0x8003_0c69
in memory.
We can place a breakpoint at this address while the host has created a game and
the guest is searching:
(gdb) break *0x80030c69
Breakpoint 1 at 0x80030c69
(gdb) continue
Continuing.
Breakpoint 1, 0x80030c69 in ?? ()
Examining the stack we can inspect the arguments:
(gdb) x/8xw $sp
0xd00f38cc: 0x001b3fb4 0x00000001 0x00000020 0xd00086a8
0xd00f38dc: 0xd00086a8 0xd00f38f0 0x00000001 0xd00f3a70
The only value that can reasonably be the input length is the third (0x20
),
so I assume that the first value is the return address pushed by the x86
call
instruction and the arguments start from the second word, giving us
the following set of values for the arguments:
dwCipher = 1
dwInputLength = 0x20 /* = 32 */
pbOutput = 0xd00086a8 /* below the stack */
pbInput = 0xd00086a8 /* below the stack */
pbKeyTable = 0xd00f38f0 /* on the stack */
dwOp = 1
pbFeedback = 0xd00f3a70 /* on the stack */
Since the length is only 32 bytes, this might be the initial broadcast message from the guest and it is about to encrypt it. We can dump it to a file and read it with hexdump directly from GDB:
(gdb) dump binary memory out 0xd00086a8 0xd00086a8+0x20
(gdb) !hexdump -C out
00000000 14 21 14 20 00 18 00 00 01 0c 01 14 21 00 01 1f |.!. ........!...|
00000010 36 cb 38 ae 61 da a6 00 01 02 03 04 05 06 06 11 |6.8.a...........|
00000020
It contains no plain text data but it is clearly not encrypted or random data. Let us continue and see if we can find the response:
(gdb) continue
Continuing.
Breakpoint 1, 0x80030c69 in ?? ()
(gdb) x/8xw $sp
0x8003a3e8: 0x001b3fb4 0x00000001 0x00000130 0x8200c83c
0x8003a3f8: 0x8200c83c 0x8003a40c 0x00000000 0x8003a58c
dwCipher = 1
dwInputLength = 0x130 = 304
pbOutput = 0x8200c83c /* kernel memory? */
pbInput = 0x8200c83c /* same as output */
pbKeyTable = 0x8003a40c /* within loaded kernel image */
dwOp = 0
pbFeedback = 0x8003a58c /* within loaded kernel image */
Now, the dwinputLength
is 304 bytes which means that this may be the response
from the host. The dwOp
argument is set to 0 which might specify decryption
while 1 specifies encryption. We can run until we reach the return address and
dump the buffer after it has hopefully been decrypted:
(gdb) until *0x001b3fb4
0x001b3fb4 in ?? ()
(gdb) dump binary memory out 0x8200c83c 0x8200c83c+0x130
(gdb) !hexdump -C out
00000000 14 20 14 21 01 20 00 00 |11 8c|01|1f 36 cb 38 ae |. .!. ......6.8.|
00000010 61 da a6|53 0c 32 bb ba 20 e8 58|6b d2 9c c4 5a |a..S.2.. .Xk...Z|
00000020 78 b6 2f|3a 85 08 da 03 5a cb 8e 31 b3 ac 57 49 |x./:....Z..1..WI|
00000030 9c d9 35|0c 00|00 12 5a 23 23 83|00 00 00 00 20 |..5....Z##..... |
00000040 14 01 00 00 00 48|00 6f 00 77 00 61 00 72 00 64 |.....H.o.w.a.r.d|
00000050 00 00 00 00 00 00|00 00 00 00 00 00 00 00 00 00 |................|
*
00000080 00 00 00 6c 65 76 65 6c 73 5c 74 65 73 74 5c 62 |...levels\test\b|
00000090 6c 6f 6f 64 67 75 6c 63 68 5c 62 6c 6f 6f 64 67 |loodgulch\bloodg|
000000a0 75 6c 63 68 00 00 00 00 00 00 00 00 00 00 00 00 |ulch............|
000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000100 00 00 00 02 00 01 00 01 00 10 00 19 00 02 00 3e |...............>|
00000110 21 3e 5e 61 67 65 20 69 6e 20 61 20 62 6f 74 02 |!>^age in a bot.|
00000120 01 02 03 04 05 06 06 11 96 d7 91 dc 81 8e 92 6e |...............n|
00000130
Oh, now we have some plain text data! The name of my Xbox “Howard” is there (seemingly, in UCS-2 encoding). The message also seems to contain the file path to the selected map and some odd “age in a bot” string?
Decrypting the traffic
Now we could potentially just go ahead and try to analyze the messages using GDB, but it is quite awkward and it also interferes with the gameplay as it adds some delay. We will therefore do some more investigating to see if we can find a way to decrypt the traffic from a separate program without affecting the execution of the emulator.
We know that the encryption and decryption happens in the XcBlockCryptCBC
function. We are not entirely sure what kind of cipher is used, but from the
name we can guess that it is a block cipher with cipher block
chaining (CBC). The Xc
kernel module also has a function by the name
XcDESKeyParity
, so perhaps the DES cipher with CBC is used. The
keyTable
argument might be the subkeys used to decrypt each block.
DES uses an 8-byte key (56 key bits, 8 parity bits) and an initialization vector (IV) that matches the block size (8 bytes). There is also a successor to DES called TripleDES that instead takes three DES keys, i.e. a 24-byte key.
We can see if we can find some higher level function that calls the
XcBlockCryptCBC
function. We can break at the entry of XcBlockCryptCBC
again and look at the stack:
(gdb) break *0x80030c69
Breakpoint 1 at 0x80030c69
(gdb) continue
Continuing.
Breakpoint 1, 0x80030c69 in ?? ()
(gdb) x /8wx $sp
0xd00f38cc: 0x001b3fb4 0x00000001 0x00000020 0xd00085a8
0xd00f38dc: 0xd00085a8 0xd00f38f0 0x00000001 0xd00f3a70
Again, we have a return address as the last element in the stack, and it is the
same as last time: 0x001b3fb4
. The function was presumably entered by using
the call
instruction which is 5 bytes long, so we can see if it lies just
above the return address:
(gdb) x /5i 0x001b3fb4-5
0x1b3faf: call 0x19346
0x1b3fb4: pop %esi
0x1b3fb5: leave
0x1b3fb6: ret $0x18
0x1b3fb9: movzbl 0xe(%ecx),%edx
It does, and it is calling a function located at 0x19345
. That function is
actually located in the game binary, rather than in the kernel. But at
0x19345
is a jump to the kernel. The function that we see the instructions
for here ends right after calling XcBlockCryptCBC
. We can check what the
whole function looks like:
(gdb) (gdb) x /31i 0x001b3fb4-80
0x1b3f64: push %ebp
0x1b3f65: mov %esp,%ebp
0x1b3f67: sub $0x188,%esp
0x1b3f6d: xor %eax,%eax
0x1b3f6f: cmpl $0x8,0x10(%ebp) ; arg3_10h == 8?
0x1b3f73: push %esi
0x1b3f74: push 0xc(%ebp) ; pbKey <- arg2_ch
0x1b3f77: setne %al
0x1b3f7a: mov %eax,%esi
0x1b3f7c: lea -0x188(%ebp),%eax
0x1b3f82: push %eax ; pbKeyTable
0x1b3f83: push %esi ; dwCipher
0x1b3f84: call 0x1934c ; XcKeyTable
0x1b3f89: mov 0x14(%ebp),%eax ; arg4_14h
0x1b3f8c: mov (%eax),%ecx
0x1b3f8e: mov 0x4(%eax),%eax
0x1b3f91: mov %eax,-0x4(%ebp)
0x1b3f94: lea -0x8(%ebp),%eax
0x1b3f97: push %eax ; pbFeedback <- arg4_14h
0x1b3f98: push 0x8(%ebp) ; dwOp <- arg1_8h
0x1b3f9b: lea -0x188(%ebp),%eax
0x1b3fa1: push %eax ; pbKeyTable
0x1b3fa2: push 0x18(%ebp) ; pbInput <- arg5_18h
0x1b3fa5: mov %ecx,-0x8(%ebp)
0x1b3fa8: push 0x18(%ebp) ; pbOutput <- arg5_18h
0x1b3fab: push 0x1c(%ebp) ; dwInputLength <- arg6_1ch
0x1b3fae: push %esi ; dwCipher
0x1b3faf: call 0x19346 ; XcBlockCryptCBC
0x1b3fb4: pop %esi
0x1b3fb5: leave
0x1b3fb6: ret $0x18
This function uses 6 arguments. From the call to XcBlockCryptCBC
we can
immediately tell that
- argument 5 is an in-place buffer used for both input and output,
- argument 6 is the length of the buffer data,
- argument 1 determines if it is an encryption (=1) or decryption (=0).
The earlier call is to XcKeyTable
which might generate the subkeys from the
input key, i.e. key schedule. We can also check its signature from the
xboxkrnl.h
header. It takes argument 2 for its pbKey
argument which means
that
- argument 2 is the encryption/decryption key.
Let us try to place a breakpoint at the function entry and look at the stack to see what arguments it receives the next time we enter it:
(gdb) break *0x1b3f64
Breakpoint 2 at 0x1b3f64
(gdb) continue
Continuing.
Breakpoint 2, 0x001b3f64 in ?? ()
(gdb) x/7wx $sp
0x8003a598: 0x001b5400 0x00000000 0xd0042cac 0x00000018
0x8003a5a8: 0x8200d034 0x8200d03c 0x00000130
The third argument seems to be a length, it is currently 0x18
= 24.
Interestingly, if the third argument is 8, dwCipher
is set to 0, otherwise to
1. This might mean that
- argument 3 is the key length,
and dwCipher=0
specifies the DES cipher while dwCipher=1
specifies the
TripleDES cipher.
That leaves us with only argument 4 remaining, which is used as the
pwFeedback
argument to XcBlockCryptCBC
. As mentioned earlier, DES also
takes an IV as input, so
- argument 4 might be the IV.
We can try to write the values for each argument and check the buffer before and after the function has executed using a GDB command:
define cbc_crypt
set $encrypt = *(int*)($sp+0x4)
set $key = *(char**)($sp+0x8)
set $key_len = *(int*)($sp+0xc)
set $iv = *(char**)($sp+0x10)
set $buf = *(char**)($sp+0x14)
set $buf_len = *(int*)($sp+0x18)
printf "key @ %p+%d:\n", $key, $key_len
dump binary memory key $key $key+$key_len
!hexdump -vC key
printf "iv @ %p+%d:\n", $iv, 8
dump binary memory iv $iv $iv+8
!hexdump -vC iv
printf "input @ %p+%d:\n", $buf, $buf_len
dump binary memory input $buf $buf+$buf_len
!hexdump -vC input
# run until ret
until *$sp
if $encrypt == 1
printf "encrypted:\n"
else
printf "decrypted:\n"
end
dump binary memory out $buf $buf+$buf_len
!hexdump -vC out
end
Running it yields:
(gdb) cbc_crypt
key @ 0xd0042cac+24:
00000000 c2 b0 31 10 79 2a 97 e5 91 bf ec 52 ce 80 fe f7 |..1.y*.....R....|
00000010 ad 19 80 c2 89 38 01 62 |.....8.b|
00000018
iv @ 0x8200d034+8:
00000000 57 9a 33 61 d1 66 7d aa |W.3a.f}.|
00000008
input @ 0x8200d03c+304:
00000000 bc 01 19 e6 1f cf 0f 49 72 f8 10 6a 48 b0 09 b3 |.......Ir..jH...|
00000010 d8 06 c7 f7 77 82 ed 03 91 82 8c 05 54 4b 74 26 |....w.......TKt&|
00000020 45 95 1e 45 fd 73 65 80 70 fd 9d f2 5a 63 6f ea |E..E.se.p...Zco.|
00000030 b5 ab d2 0f a7 05 40 34 9d a0 31 86 09 34 b5 2b |......@4..1..4.+|
00000040 74 eb 3b b6 7e 3f cf 2c b8 1d e7 54 86 56 90 2a |t.;.~?.,...T.V.*|
00000050 79 eb 88 91 5a 85 eb 84 c9 45 bf 5b cb c4 f1 5a |y...Z....E.[...Z|
00000060 9c ff 1f 12 3d 78 c6 a8 ee 8c dc da c8 4e 59 19 |....=x.......NY.|
00000070 53 3b ba 3e 8f 10 32 95 d0 ab 4f f7 10 64 fe 3f |S;.>..2...O..d.?|
00000080 26 92 a8 c8 cd 66 8f cc ba d0 9e 22 69 c7 dc 4b |&....f....."i..K|
00000090 c6 ec 2a d3 80 50 38 e5 1e 08 76 21 87 78 56 b1 |..*..P8...v!.xV.|
000000a0 7b 85 a6 c4 2a 0e e8 a5 5b 29 c3 39 58 84 a9 bd |{...*...[).9X...|
000000b0 06 67 fe c4 fa e4 7b 03 63 ca 9c 5d 55 6d 15 10 |.g....{.c..]Um..|
000000c0 3c bf 25 c7 b9 4c 7a d2 7c 04 05 78 fd 3c 91 c7 |<.%..Lz.|..x.<..|
000000d0 18 67 a7 0e 3d d6 a9 aa 31 68 d7 bb 10 eb 95 c7 |.g..=...1h......|
000000e0 ca 0e e0 fa 1e 74 2f 5a 68 a4 45 3a 7a 7c 99 da |.....t/Zh.E:z|..|
000000f0 79 0e 30 2a 36 13 a4 fe a3 87 a6 78 7c e6 70 5f |y.0*6......x|.p_|
00000100 8e c1 96 d7 7e 5f c9 8f 56 8d 1b 2d ed 87 75 f6 |....~_..V..-..u.|
00000110 4d 18 5b 64 03 6e e9 be 9b 2a 67 aa 04 c8 77 69 |M.[d.n...*g...wi|
00000120 08 a5 e3 af fe f4 9b b8 0c 70 77 2b 00 79 dc a5 |.........pw+.y..|
00000130
0x001b3fb6 in ?? ()
decrypted:
00000000 14 20 14 21 01 20 00 00 11 8c 01 40 b6 43 66 a7 |. .!. .....@.Cf.|
00000010 b5 22 db ed 67 79 42 41 96 0c bd 64 96 aa 34 72 |."..gyBA...d..4r|
00000020 ca 5e 97 3d e7 7a e4 79 35 18 53 b3 59 8d 66 5c |.^.=.z.y5.S.Y.f\|
00000030 c7 ed 84|0c 00|00 12 5a 23 23 83|00 00 00 00 20 |.......Z##..... |
00000040 14 01 00 00|00 48 00 6f 00 77 00 61 00 72 00 64 |.....H.o.w.a.r.d|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000060 00 00 00 00|00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000080 00 00 00|6c 65 76 65 6c 73 5c 74 65 73 74 5c 62 |...levels\test\b|
00000090 6c 6f 6f 64 67 75 6c 63 68 5c 62 6c 6f 6f 64 67 |loodgulch\bloodg|
000000a0 75 6c 63 68 00 00 00 00 00 00 00 00 00 00 00 00 |ulch............|
000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000100 00 00|00 02|00 01|00 01| 00 10|00 19|00 02|00|3e |...............>|
00000110 21 3e 5e|61 67 65 20 69 6e 20 61 20 62 6f 74|02 |!>^age in a bot.|
00000120 01 02 03 04 05 06 06 11 50 16 3e 60 ac ff 0d d0 |........P.>`....|
00000130
We can use OpenSSL to verify if it is actually using TripleDES CBC. After some trial and error, I managed to find a set of arguments that successfully decrypts the message with the above key and IV:
$ openssl des-ede3-cbc -d -nopad \
-K $(hexdump -ve '/1 "%02x"' key) \
-iv $(hexdump -ve '/1 "%02x"' iv) \
-in input \
| hexdump -C
00000000 14 20 14 21 01 20 00 00 11 8c 01 40 b6 43 66 a7 |. .!. .....@.Cf.|
00000010 b5 22 db ed 67 79 42 41 96 0c bd 64 96 aa 34 72 |."..gyBA...d..4r|
00000020 ca 5e 97 3d e7 7a e4 79 35 18 53 b3 59 8d 66 5c |.^.=.z.y5.S.Y.f\|
00000030 c7 ed 84 0c 00 00 12 5a 23 23 83 00 00 00 00 20 |.......Z##..... |
00000040 14 01 00 00 00 48 00 6f 00 77 00 61 00 72 00 64 |.....H.o.w.a.r.d|
00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000080 00 00 00 6c 65 76 65 6c 73 5c 74 65 73 74 5c 62 |...levels\test\b|
00000090 6c 6f 6f 64 67 75 6c 63 68 5c 62 6c 6f 6f 64 67 |loodgulch\bloodg|
000000a0 75 6c 63 68 00 00 00 00 00 00 00 00 00 00 00 00 |ulch............|
000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000100 00 00 00 02 00 01 00 01 00 10 00 19 00 02 00 3e |...............>|
00000110 21 3e 5e 61 67 65 20 69 6e 20 61 20 62 6f 74 02 |!>^age in a bot.|
00000120 01 02 03 04 05 06 06 11 50 16 3e 60 ac ff 0d d0 |........P.>`....|
00000130
Now we can try to observe the arguments for the “cbc_crypt” function for more packets, allowing us to make some observations:
- The 24-byte key is always the same, it is symmetric and is used for all broadcast messages.
- After joining a game, two new 8-byte symmetric keys are used, one for messages encrypted by the host and one for messages encrypted by the guest. The same pair of keys are used until the game is closed or the guest leaves.
- The crypt function is not entered for all packets, some packets are sent without encryption.
-
The UDP payload of encrypted packets looks similar to an IPSec ESP in
transport mode. The payload seems to contain
-
A 4-byte Security Parameter Index (SPI). It is set to
0xffff_ffff
for broadcasts, otherwise:- 1 byte indicating whether the packet is encrypted (=1) or not (=0),
- 2 bytes to identify the security association,
- 1 byte set to zero or one (unknown),
-
a 4-byte sequence number, set to
0xffff_ffff
for broadcasts, incremented once for each packet after a security association is established, - an 8-byte IV, used to decrypt the message,
-
an encrypted payload, containing
- an 8-byte UDP header OR a 20-byte TCP header,
-
a 2-byte value that is set to
(l << 4) | 0xc
wherel
is the length of the encrypted payload minus the transport layer header and padding. - 1 byte always set to one.
- a variable-length message,
- a variable number of incrementing padding bytes,
- 1 byte specifying pad length,
- 1 byte specifying IP protocol (UDP or TCP),
- a 12-byte checksum or Integrity Check Value (ICV) of the packet.
-
A 4-byte Security Parameter Index (SPI). It is set to
Typically, ESP is used directly on top of IP but, interestingly, they have chosen to use it on top of UDP and then also use UDP or TCP packets inside the encrypted packets.
A pre-shared key is used for broadcasts and first when joining, a type of security association is established where the two instances agree on a pair of keys to use for the remaining packets.
We can also confirm that the algorithm using 8-byte keys is DES, the below openssl command successfully decrypts messages with a given key and IV:
$ openssl des-cbc -provider legacy -d -nopad \
-K $(hexdump -ve '/1 "%02x"' key) \
-iv $(hexdump -ve '/1 "%02x"' iv) \
-in input
Key derivation
The whole encryption stuff got me intrigued and I am now curious how it is set up. If we investigate further, we might be able to create a program that can read the packets without relying on an emulator or debugger to obtain the keys.
In order to more easily understand what is going on, we will run two emulators that connect to each other, each with a GDB instance attached to it. We will also use tcpdump to record the packets to a file (.pcap). Each GDB instance will run a command that sets up breakpoints at points of interest, such as when entering a system call. Whenever a breakpoint is hit, it checks the PC to determine which one was triggered, and then dumps internal state such as inputs and outputs in a readable format. We can then see if any of the inputs or outputs from the different system calls match each other or parts of the packets and therefore follow the derivation of the keys. The workflow looks like this:
- Start two game instances, each with a separate GDB server.
-
Run a script that:
- connects a GDB client to each GDB server and loads a GDB command,
- launches tcpdump to record packets to a file,
- Do something interesting, e.g. searching for or joining a game,
- Inspect the traces of internal state and packet contents,
- Update the GDB command with breakpoints or watchpoints at new locations of interest,
- Repeat from 2.
After some manual investigations like this we have mapped out how each client derives the DES keys and where each piece of data comes from. The process for each client looks something like this:
The XcHMAC
function is used extensively. It generates a typical hash-based
message authentication code using SHA-1. It takes a 16-byte key, a
variable-length input and generates a 20-byte message authentication
code. We can replicate the implementation using e.g. OpenSSL’s SHA-1
implementation.
By reimplementing the system calls, most of the things in the derivation can easily be replicated by a program of ours. Most inputs are just constants, and some are from the packets. There is, however, one concerning source of data here: a pseudo-random number generator (PRNG). How could our program possibly have a PRNG that generates the exact same number as the game instance? Also, how can two game instances derive the same key with two different random numbers as inputs?
Looking at the traces from the debugger, the two instances do indeed have
completely different exponent inputs. However, they also have different base
inputs, and they end up with the exact same output. The function in question is
the XcModExp
function, which performs modular exponentiation, i.e.
it calculates
. In this case each value is a
768-bit integer (96 bytes) so brute forcing would take a while. The exponent
is random and the base is from one of the messages, i.e. it comes from the
other client. Looking further back in the trace we can see that the clients
first called XcModExp
to generate one number each4:
where is a shared constant5 found in the game binary and , are randomly generated numbers. The instances then share and with each other in the handshake packet and then calculate
and
respectively. Both end up with the same value , without ever knowing the other instance’s -value. This is known as the Diffie-Hellman key exchange. As a bystander, we only know , and , without neither or it is very difficult for us to determine 6. But for us to be able to derive the keys, we will have to somehow determine .
Hmm… If we were to act as a client, we could easily mimic what the game does and get another client to generate an -value together with us. If we can do that with one of the clients, we should be able to do it with two clients simultaneously. What if we could calculate an -value for two instances, and then act as an intermediary between them… What if we could act as a…
Man in the middle
It seems like it might be (practically) impossible for a listener to obtain the keys used between two parties who have successfully established a security association. However, it should be possible for us to create two security associations, one with each party, and then simply shuffle the messages between them by decrypting and re-encrypting each packet. It’s like playing chess with two other players at the same time and just relaying their moves, effectively making them play against each other. This can be considered a man-in-the-middle (MITM) attack. Performing such an attack would also enable us to alter the packets without the clients knowing.
In order to perform the attack we will write a C program that reads and listens
to a network socket. We will be using a raw socket which picks up all packets,
regardless of the destination IP address7. When running Linux, we can
achieve that by opening a socket with the AF_PACKET
address family, the
SOCK_RAW
socket type and the ETH_P_ALL
protocol as described by
packet(7)
:
int sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));
if (sockfd == -1) {
perror("socket failed");
return EXIT_FAILURE;
}
We are now able to read packets from all interfaces using recv(2)
on the
socket:
uint8_t p[BUFSIZE];
ssize_t p_len = recv(sockfd, p, sizeof(p), 0);
if (p_len == -1)
return;
But in order to be able to send packets via the socket, we have to bind it to a
specific interface. We can look up the index of a specific interface using
ioctl(2)
and then bind it:
struct ifreq ifr;
strcpy(ifr.ifr_name, net_interface);
size_t ret_ifindex = ioctl(sockfd, SIOCGIFINDEX, &ifr);
assert(ret_ifindex == 0);
struct sockaddr_ll sll = {
.sll_family = AF_PACKET,
.sll_ifindex = ifr.ifr_ifindex,
.sll_protocol = htons(ETH_P_ALL),
};
if (bind(sockfd, (struct sockaddr *)&sll, sizeof(sll)) == -1) {
perror("bind failed");
exit(EXIT_FAILURE);
}
We can now send packets over the interface using send(2)
:
int send_len = send(sockfd, p, p_len, 0);
if (send_len == -1) {
perror("send failed");
exit(EXIT_FAILURE);
}
As we saw previously, the searching guest client will send out a broadcast packet to which all hosts with an active game will reply to. If we just copy a broadcast message in full, and broadcast it ourselves, we receive replies from the hosts as well. However, if we decrypt it, modify any byte and re-encrypt it, we no longer receive a reply.
Recall that there was an ICV, i.e. a checksum of the packet contents that was
appended to the end of the packet. Can we alter the packet and simply
recalculate the ICV? I think we can. Looking at our trace from GDB, after a
packet has been encrypted and before it is sent over the network, a call to
XcHMAC
is performed with the packet contents (starting from the UDP payload)
as the data and a constant value as the key. The first 12 bytes of the output
then matches the 12 last bytes of the packet as seen by tcpdump. We can simply
reuse that key and generate a new ICV after we have altered the packet.
Now we can try to decrypt the packet, alter some of the bytes, re-encrypt it and append it with a correct ICV. Depending on what bytes we alter, we still get a reply. After some experimentation with altering various parts in the packets while inspecting the traces, the two broadcast packets seem to be structured like this8:
/* guest searching for games */
struct broadcast_g2h {
struct udphdr udp;
uint16_t packet_size; /* (0x10 << 4) | 0xc */
uint8_t _ah; /* 01 */
uint8_t udp_port_guest[0x2];
uint8_t _eh[0x2]; /* 00 01 */
uint8_t guest_id[0x8];
uint8_t packet_type; /* 00 */
};
/* host responding to search */
struct broadcast_h2g {
struct udphdr udp;
uint16_t packet_size; /* (0x118 << 4) | 0xc */
uint8_t _ah; /* 01 */
uint8_t guest_id[0x8];
uint8_t host_id[0x8];
uint8_t hmac_data0[0x8];
uint8_t hmac_data1[0x10];
uint8_t _33h[0x2]; /* 0c 00 */
uint8_t mac_src[ETH_ALEN];
uint8_t _3bh[0x9]; /* 00 00 00 00 20 14 01 00 00 */
char host[0x20];
uint8_t _58h[0x1f]; /* 00 .. */
char level_path[0x7f];
uint16_t game_rules;
uint16_t _104h; /* 00 01 */
uint16_t number_of_players;
uint16_t _108h; /* 00 10 */
uint16_t score_limit;
uint16_t mode;
uint8_t _10eh; /* 00 */
uint8_t game_status[0x4];
uint8_t age_in_a_bot[0xc];
uint8_t packet_type; /* 02 */
};
Both packets start with a UDP header and end with a byte that specifies what type of packet it is. This byte at least seems to be constant and unique to each type of packet so we will use it as a packet type identifier, regardless of its intended meaning.
At first, the guest initially sends a packet that specifies an ID for itself. The host then replies with the same guest ID, and another ID for itself. It also specifies some random data that is used in some HMAC calls on both sides to generate the keys, as seen in the derivation diagram above. The host also specifies its own MAC address so the guest knows where to send unicast packets. Finally, the broadcast contains the game name and game options that we are able to see on the screen when selecting a game to join.
Then, when the guest actually chooses to join a game, the guest client sends an unencrypted handshake unicast packet to the host MAC address to which the host immediately replies with a similar packet. The two handshake packets are structured like this:
struct handshake {
uint8_t _0h[0xc]; /* 00 00 00 00 00 XX XX 00 00 00 00 00 */
uint8_t hmac_data1_guest[0x8]; /* randomized by guest */
uint8_t hmac_data1_host[0x8]; /* randomized by host */
uint8_t mod_exp_base[MOD_EXP_LEN];
uint8_t broadcast_hmac_data0[0x8];
uint8_t _7ch[0x2]; /* 0c 00 */
uint8_t mac_src[ETH_ALEN];
uint8_t _84h[0xb]; /* 00 00 00 00 XX XX XX e9 fc d8 d9 */
uint8_t packet_type; /* 01 */
};
The guest sends some randomly generated HMAC data which is used during the key
derivation, as seen in the above diagram. The hmac_data1_host
field is just
filled with zeros. The guest also includes its
value that will be used
as the base in the XcModExp
call that we mentioned earlier. The guest also
resends the HMAC data it received from the host’s broadcast. It also includes
its MAC address so the host can send unicast packets to it.
The host then replies with a similar packet. But it writes its randomly
generated HMAC data to the hmac_data1_host
field and provides its
value for XcModExp
. Both clients now use the exchanged information to derive
a pair of DES keys that will be used for all encrypted packets from here
onwards. Similarly, they will derive a pair of keys used for the ICV for the
remaining packets.
So, let us try to intercept this process. We want to make the clients think that they are communicating with each other, while in reality they are both talking directly to us and we are just passing the messages along (with some small harmless alterations).
Since no essential information is only present in the broadcast from the guest, we can just ignore that. But as soon as we see a broadcast reply from the host, we will immediately copy that and broadcast it with some alterations.
We want the guest to send the handshake to us instead, so we will just
overwrite the mac_src
field that contains the host MAC address with the
address of our own network interface. Now when we press join, the guest sends
the handshake packet to our address instead and the host simply ignores it and
does not respond with a handshake packet9. The guest client
then displays an error message because we did not respond.
Next, we will simply take the handshake packet from the guest and resend it to
the host (we know the host MAC address from the mac_src
field in its
broadcast that we intercepted). Because we do not know the secret
value
of the guest, we will also have to modify the
value before we pass it to
the host. Remember, the host will use it to calculate
We need to select the value of
such that we can know what value of e
the host will obtain. Any ideas? How about
? Then we do not even need
to implement the XcModExp
function because we can perform the 768-bit integer
calculation right here and now:
Easy! So if we provide to the host, we know it will get , so we will also use that number for the key derivation.
Next, we will receive a reply from the host with a handshake packet. We now have both the guest and host HMAC data so we are able to derive the DES and ICV keys to be used for the remaining packets. We can then overwrite the host’s value in the handshake packet with 1 as well, replace the MAC address with ours and pass it along to the guest.
After that we are in, when joining the game, we successfully enter the game through our MITM client! The attack can be summarized with the following sequence diagram:
All remaining packets after the handshake can be passed through without any modifications. Each packet is sent to our client, decrypted and re-encrypted and then passed along to the other client using our derived keys. We no longer need an emulator with a debugger attached to read messages. We are now able to read and even alter packets sent between two completely unmodified consoles!
The game network protocol
The time has come, we are now finally able and willing to do what we set out to do in the first place, investigate the game network protocol!
After some experimenting with pushing various buttons and trying different player counts while watching the packets in real-time, it seems like the clients more or less just send the player inputs to each other. The input for each player is contained in a 30-byte struct that looks like this:
struct player {
uint32_t actions;
float yaw;
float pitch;
float forward;
float left;
float fire_duration;
int16_t selected_weapon;
int16_t selected_grenade;
int16_t zoom;
};
The yaw
is set to the current direction that the player is facing, the
pitch
is the current pitch10. forward
is the ground
acceleration in the forward direction, left
for sideways ground acceleration
with left as positive. The fire_duration
starts counting from 0 when pressing
fire and resets to zero when releasing, incrementing around 0.1 per second.
selected_weapon
is the index of the currently equipped weapon. Similarly for
selected_grenade
but it is -1 when one has run out of grenades. zoom
is -1
initially and becomes 0 for the first level of zoom and 1 for the second. These
represent the current state and will not change if you e.g. try to switch
grenades when you only have one type.
The actions
field contains bits for at least 9 actions:
enum action {
CROUCH = 1 << 0x0, /* left stick */
JUMP = 1 << 0x1, /* A */
TOGGLE_FLASHLIGHT = 1 << 0x4, /* white */
RELOAD = 1 << 0x6, /* X */
MELEE = 1 << 0x7, /* B */
FIRE = 1 << 0xb, /* right trigger */
THROW_GRENADE0 = 1 << 0xc, /* left trigger */
THROW_GRENADE1 = 1 << 0xd, /* left trigger */
PICK_UP = 1 << 0xe, /* X hold */
};
Each bit is high as long as its corresponding button is held down, although the
PICK_UP
only goes high after holding the “X” button for a short while. All of
the actions will go high when you press their corresponding button regardless
if the action can be performed or not. E.g. if you try to fire with no ammo,
the FIRE
action will still be active.
These player structs are then placed in a continuous array, with one element for each player. The guest sends one struct for each player that is local to their console. The packet from the guest to the host is structured like this:
/* sent from each guest to the host once every tick */
struct ingame_g2h {
struct udphdr udp;
uint16_t packet_size; /* (0x09 + player_count * 0x1e) << 4 | 0xc */
uint8_t _10h; /* 01 */
uint8_t tick[0x4];
uint8_t player_count;
struct player players[]; /* only players on guest console */
};
The guest packet also contains the index of the current tick and some unknown
data. It is not part of the struct definition here, but there is also a single
byte after the player array that is always set to 0x19
11.
The host replies with not just the player structs for all remaining players, but it also resends the player structs of the guest that it replies to. The packet is structured like this:
/* sent from the host to every guest once every tick */
struct ingame_h2g {
struct tcphdr tcp;
uint16_t packet_size; /* ((0x11 + player_count * 0x1e) << 4) | 0xc */
uint8_t _16h; /* 01 */
uint8_t tick_a[0x4];
uint8_t hash[0x4]; /* XX XX XX XX */
uint8_t tick_b[0x4];
uint8_t player_count;
struct player players[]; /* all players in game */
};
The host packet contains two tick counters that are usually the same but
sometimes go out of sync. They are typically 1 step ahead of the guest tick.
The hash
value is sometimes the same for multiple ticks, and changes more
often when players are not idle. After the player array, there is one byte that
is always set to 0x14
.
The clients are more or less only sending inputs between each other, and are then relying on each other to simulate the game identically. It seems like there is a risk for the clients to go out of sync if any packets were to be dropped or delayed. We can notice that when e.g. momentarily minimizing the emulator, causing it to not reply in time, the game thereafter becomes unplayable and eventually disconnects.
Altering in-game packets
So what would happen if we altered any of the game information sent between the clients while playing? We can try to modify packets from the host at first, e.g. make a host player walk left when firing:
struct ingame_h2g *game = (struct ingame_h2g *)msg;
struct player *p0 = (void *)game->players;
if (ntohl(p0->actions) & FIRE) {
p0->actions = htonl(ntohl(p0->actions) | CROUCH);
}
When the host and guest player are facing each other, and the host begins shooting, the guest sees the host shooting and walking left for a second before the screen changes to a full screen message:
ATTENTION
Your Xbox console has gone out of sync with the other consoles in the game. Resetting…
Press START to Continue
Then the guest attempts to “reset”12 but disconnects shortly thereafter. Interestingly, the clients did not just continue playing out of sync, the guest client quickly detected13 that they were out of sync and supposedly tried to reset.
What would happen if we instead modify the information from the guest to the host? Remember, the host resends the player data that it receives to all guest clients, including the guest that it received it from. Will the host use the modified data and the guest use the original data, causing them to go out of sync like we saw previously? Or will the guest check that the data is unchanged and reject it immediately? Or will it just accept the received data and use that instead of the data that it sent? We can try it out:
struct ingame_g2h *game = (struct ingame_g2h *)msg;
struct player *p0 = (void *)game->players;
if (ntohl(p0->actions) & TOGGLE_FLASHLIGHT) {
p0->forward = htonf(200 * ntohf(p0->forward));
p0->left = htonf(200 * ntohf(p0->left));
}
If a player from a guest client is holding the flashlight button, we will increase their speed before we send the packet off to the host. Let’s see what happens when we run it and hold down the flashlight button as a guest player:
We are able to run a lot faster than usual, even faster than a warthog jeep! I was really hoping for that to work and I am quite surprised it actually did! Heh, it is quite fun to fly around using the hills and valleys of Blood Gulch, sort of reminds of Tribes. As you can see, it is very easy to kill oneself from fall damage, though.
This must mean that the guest client sends its controls to the host and assumes it gets back the same without actually checking it. The guest uses the response as the actual truth, completely ignoring what it actually sent. We are also lucky that the clients do not saturate the speed even though one can never run faster than 1.0 units.
Some other things I tried with the client packets:
- Changing the yaw or pitch causes the guest player camera to look one way and fire in another direction.
- Using a vehicle with increased speed: the warthog and scorpion seem to have binary speed, either 0 or 1 and are not affected at all by a higher speed value. The acceleration of the ghost does increase with a higher speed, but going way too fast it immediately leans forward and flips over, forcing you out of the vehicle.
- Modifying the fire duration does not seem to have any effect.
- Changing the selected grenade can prevent you from using grenades that you have but cannot let you use ones that you do not have.
If you have a Linux machine and want to try it out yourself, the MITM client is
available here. Just compile it, make sure it is allowed to use
raw sockets, and specify pal
/ntsc
and a network interface when running:
$ cc -lcrypto mitm_h1.c -o mitm_h1
# setcap cap_net_raw=eip mitm_h1
$ ./mitm_h1 pal eth0
You can also use -DSNEAKY
when compiling to make it perform the attack
silently, e.g. if you want to get the to the rocket launcher first every time
at the next LAN party (people still have those, right?). Just be discrete with
the laptop that is connected to the network switch and has console text quickly
scrolling by. :>
Summary
We tried to simply read the messages between two consoles playing Halo, but alas, we were immediately met by a stream of encrypted gibberish. By setting up a game instance in an emulator and attaching GDB to it we were able to peek at the contents of the secret messages that were sent between the instances. We also figured out that the game uses DES and TripleDES encryption and were able to decrypt the packets by peeking at the keys using GDB.
Investigating further, we determined how the instances derive the keys and establish a security association during their initial handshake. By intervening in that process, we were able to perform a man-in-the-middle attack, allowing us to read and alter any packets that were sent between two consoles.
Then, we finally took a look at the packets that were sent while playing. After figuring out their contents we were able to alter the packets from a guest client and implement a “speedhack” from a remote device for a game played by two unmodified consoles, using only a computer connected via a network switch.
The authentication primitives are part of the Xbox kernel and are potentially common for all System Link games. The method demonstrated here should be possible to replicate in order to perform similar attacks for other Xbox games, potentially enabling other more interesting hacks. For example, would it not be cool if there was a game with a stack buffer overflow bug that allowed us to run arbitrary code remotely? That could potentially enable us to softmod an Xbox over the network, without the need of any special hardware14.
This article has some discussion on Hacker News.
-
The Xbox dashboard has a PAL60 option to make games run at 60 Hz but Halo: Combat Evolved runs at 50 Hz with 25 FPS regardless.↩︎︎
-
Port 3074 is a registered port assigned by the IANA.↩︎︎
-
Is it possible to simply list the currently memory mapped regions of a remote GDB target instead of determining them with trial and error like this? E.g. like
info proc mem
for a native process.↩︎︎ -
The constant is represented with 96 bytes: 95 zeros followed by a two. I initially thought it was a big-endian 2 but since the
XcModExp
uses little-endian, the value will evaluated as . I am not sure if is a better value than 2 here and if it was even intentional to use that over simply 2.↩︎︎ -
The constant is the prime number , referred to as the “First Oakley Default Group” by RFC 2409. Although, it it is written in big-endian while
XcModExp
seems to use little-endian.↩︎︎ -
This is known as the Diffie-Hellman problem.↩︎︎
-
The packets are kind of odd, they always use the same destination IP addresses for all recipients (
255.255.255.255
for broadcasts and0.0.0.1
for unicasts). I tried to capture the packets by using UDP sockets that listen to the Xbox port. The broadcasts that were addressed to255.255.255.255
were picked up after binding toINADDR_ANY
(0.0.0.0
) orINADDR_BROADCAST
(255.255.255.255
). However, the packets addressed to0.0.0.1
were not, and I was also not able to bind the socket to0.0.0.1
because that address is not considered local. I am not sure if it is possible to listen to addresses not assigned to my machine? Would I be able to listen to it if I managed to statically assign0.0.0.1
to my machine?↩︎︎ -
Due to some fields not being aligned like a struct in C is typically aligned, we have to use
uint8_t
arrays in order to avoid padding.↩︎︎ -
Interestingly, the guest seems to take the latest packet it receives and just overwrite the earlier information it received rather than trusting the first packet and ignoring later packets. If we also choose to overwrite the host ID, we instead get two options two choose from in the game selection list. But by keeping the same host ID, we can silently act act as the original host.↩︎︎
-
The yaw and pitch is in radians from 0 to and to , respectively. The yaw is 0 when facing east (assuming the assault rifle compass is pointing north).↩︎︎
-
We cannot represent the elements after the player array in the struct because the number of players varies, instead we use a C99 flexible array member which just points to the start of the player array and ignore the byte after because it is constant anyway.↩︎︎
-
I am not sure how or if it is attempting to reset. In order to resynchronize, would the host not need to send its full game state? There are no additional packets being exchanged except for the normal ones.↩︎︎
-
How does it detect the desync, though? Perhaps they are using the
hash
value that is sent by the host. The host might calculate a hash of its entire game state, if the guest client then obtains a different hash for its game state compared to the one it receives by the host, it will present the desync message and try to reset. Modifying this hash value once using our MITM client causes the desync message to flash by for a second on the guest console and make the guest players immediately leave the game.Unfortunately, this makes it difficult to create a dedicated server. If it were not for this desync detection, it might have been possible to create a simple server that simply broadcasts all player information to the guests.↩︎︎
-
As far as I know, all current softmod methods still require special Xbox adapters in order to transfer the exploit program to the Xbox.↩︎︎