``` ||||| _ _ _ _ = Z . = | |_ | | |_ __ _ _ _ _ ___| |_ = , = | ' \| | | ' \| ' \ _| ' \/ -_) _| = o ` = |_||_|_|_|_|_|_|_||_(_)_||_\___|\__| ||||| ``` # Exploring The Halo 1 System Link Protocol _September 18, 2023_ One of my all-time favorite games is [_Halo: Combat Evolved_][h1] that launched together with the original [Xbox][] in 2001. One major part of the game was multiplayer, up to 16 people at a LAN party could play together by connecting four Xbox consoles over [Ethernet][], referred to as [_System Link_][]. Out of pure curiosity, we will try to look at how the Halo game instances communicate with each other over System Link. In order to do that, we will set up System Link, play some Halo multiplayer, and see if we can reverse engineer the network protocol by inspecting the messages that are sent between the consoles. Should be easy enough, right? [h1]: https://en.wikipedia.org/wiki/Halo:_Combat_Evolved [Xbox]: https://en.wikipedia.org/wiki/Xbox_(console) [Ethernet]: https://en.wikipedia.org/wiki/Ethernet [System Link]: https://en.wikipedia.org/wiki/System_Link Table of contents - 1 [System Link setup][] - 2 [Observing the traffic][] - 3 [Watching from inside][] - 4 [Decrypting the traffic][] - 5 [Key derivation][] - 6 [Man in the middle][] - 7 [The game network protocol][] - 8 [Altering in-game packets][] - 9 [Summary][] ## System Link setup I still have a fully functional unmodified original Xbox console. We will connect it to a computer using Ethernet cables via a network switch and run an Xbox emulator on the computer that will play together with the console. We will use the [xemu][] emulator, which is based on the excellent [QEMU][] project. The Xbox contains a more or less off-the-shelf Intel Pentium III, a 32-bit [x86][] CPU. If interested, you may find a good introduction to the Xbox architecture [here][copetti-xbox]. In order to run Halo with xemu, we will need some files, in particular: - a 6.9 GiB game disc, containing the game executable ([XBE][]) and all assets, - an HDD image in QEMU's qcow2 format, containing a dashboard that launches the game, - a 256 KiB [BIOS ROM][] (actually 1024 KiB, replicated 4 times), containing the BIOS, bootloader and the Xbox [kernel][xboxkrnl], - a 512 B [MCPX ROM][] which does some configurations at boot and then decrypts and runs the bootloader, - a 256 B [EEPROM][], with some device-specific information, e.g. the [MAC address][macaddr]. The xemu project provides pre-formatted [HDD images][xemu-hdd] that contain a dummy dashboard that we will use. Xemu will also automatically set up an EEPROM for us. For the MCPX ROM, we will use an open source [reimplementation][mcpx-mouse]. For the game disc and BIOS ROM, we will have to use dumped images. [xemu]: https://github.com/xemu-project/xemu [QEMU]: https://www.qemu.org/ [x86]: https://en.wikipedia.org/wiki/X86 [copetti-xbox]: https://www.copetti.org/writings/consoles/xbox/ [XBE]: https://xboxdevwiki.net/Xbe [BIOS ROM]: https://xboxdevwiki.net/BIOS [xboxkrnl]: https://xboxdevwiki.net/Kernel [MCPX ROM]: https://xboxdevwiki.net/MCPX_ROM [EEPROM]: https://xboxdevwiki.net/EEPROM [macaddr]: https://en.wikipedia.org/wiki/Mac_address [xemu-hdd]: https://xemu.app/docs/required-files/#hard-disk-image [mcpx-mouse]: https://github.com/SnowyMouse/fancy-mouse-boot-rom We will use [PAL region][PAL] copies of the game, which differ from the ones in NTSC regions. For example, the original NTSC version runs at 30 frames per second while the PAL version runs at 25 frames per second[^pal60]. [^pal60]: The Xbox dashboard has a PAL60 option to make games run at 60 Hz but Halo: Combat Evolved runs at 50 Hz with 25 FPS regardless. [PAL]: https://en.wikipedia.org/wiki/PAL_region Now we boot up Halo on the Xbox console, create a System Link game and connect the console to our computer with some Ethernet cables. We then boot up the emulator, set up the above mentioned files, make sure it is set up to use the Ethernet [network interface][netifc] and then search for System Link games. ``` SYSTEM LINK GAMES ------------------------------------------------------------------- ___________________________________________________________ |_Howard__________________ _____________ _____________ \ | | _ _ | | | | | | / \__/ \| | O | | | | |__/__\__| | /0\ | | | | T-_ _ | | / | \ | | | | o\_/ov | | / \ | | | |_____________| |_____________| | | | | Game Status: Accepting Players | | Map: Blood Gulch | | Game Rules: Slayer | | Mode: Free for All | | Number of Players: 1 | | Score Limit: 25 Frags | | | \_________________________________/ ------------------------------------------------------------------- Y = CREATE GAME B = BACK A = SELECT ``` And voila! We find the game hosted by the console and can play some Halo over LAN. [netifc]: https://en.wikipedia.org/wiki/Network_interface_controller ## Observing the traffic While searching for games, packets should be sent between the instances continuously, because as soon as the game is closed it is removed from the list. We can then run e.g. [`tcpdump`][] and see if we can detect any of these packets over our network interface. [tcpdump]: https://www.tcpdump.org/ ```cmdline $ tcpdump listening on enp9s0, link-type EN10MB (Ethernet), snapshot length 262144 bytes 14:42:58.791069 IP 0.0.0.1.xbox > 255.255.255.255.xbox: UDP, length 60 14:42:58.801698 IP 0.0.0.1.xbox > 255.255.255.255.xbox: UDP, length 324 14:42:58.919744 IP erlite.domain > igelkott.49337: 31319* 1/0/0 PTR igelkott. (65) 14:43:00.531704 ARP, Request who-has narwhal tell igelkott, length 28 14:43:00.808959 IP 0.0.0.1.xbox > 255.255.255.255.xbox: UDP, length 60 14:43:00.810704 IP 0.0.0.1.xbox > 255.255.255.255.xbox: UDP, length 324 ``` Besides some random traffic on my LAN, we can see that there are some [UDP][udp] packets that go via a port that `tcpdump` has actually labeled as "xbox"[^xbox-user-port]. We can use the [BPF][] expression `udp port xbox` to filter by this UDP port, and ignore all other traffic. In order to normalize the timestamps to be offset from the first packet, we will add the `-ttttt` option. [^xbox-user-port]: Port 3074 is a [registered port][iana-ports] assigned by the [IANA][]. [udp]: https://en.wikipedia.org/wiki/User_Datagram_Protocol [iana-ports]: https://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.txt [IANA]: https://en.wikipedia.org/wiki/Internet_Assigned_Numbers_Authority Interestingly, the Xbox consoles always use `0.0.0.1` as the source [IP address][ipaddr] and they send broadcast messages to `255.255.255.255`. They seem to address each other by MAC addresses only, we therefore add the `-e` option to display the MAC addresses in order to know which of the two machines each packet is sent from. [ipaddr]: https://en.wikipedia.org/wiki/IP_address In order to improve readability, we will strip some redundant information and replace the MAC addresses of the machines by the names of the consoles used by the game. My console happens to be called "Howard" and is the host of the game while the emulator is called "Ghost" which is a guest that connects to the host's game. Our output should then be easier to read: ```text 00.000000 Ghost > ff:ff:ff:ff:ff:ff, len 60 00.015546 Howard > ff:ff:ff:ff:ff:ff, len 324 02.001024 Ghost > ff:ff:ff:ff:ff:ff, len 60 02.024528 Howard > ff:ff:ff:ff:ff:ff, len 324 04.040860 Ghost > ff:ff:ff:ff:ff:ff, len 60 04.074594 Howard > ff:ff:ff:ff:ff:ff, len 324 ``` [BPF]: https://en.wikipedia.org/wiki/Berkeley_Packet_Filter Now we can see that the guest searches for existing games with a 60-byte packet every 2 seconds, to which the game host quickly responds with a 324-byte packet. If we then join the game, we initially get a rapid series of exchanged packets: ```text 00.000000 Ghost > Howard, len 164 00.013472 Howard > Ghost, len 164 00.023361 Ghost > Howard, len 48 00.023476 Howard > Ghost, len 48 00.024722 Ghost > Howard, len 44 00.025583 Ghost > Howard, len 140 00.096030 Howard > Ghost, len 60 00.096288 Howard > Ghost, len 1132 00.121319 Ghost > Howard, len 124 00.135767 Howard > Ghost, len 44 00.137273 Howard > Ghost, len 1132 00.155669 Ghost > Howard, len 44 00.160945 Ghost > Howard, len 92 00.161371 Ghost > Howard, len 316 00.178269 Howard > Ghost, len 1132 00.200977 Ghost > Howard, len 92 00.336793 Howard > Ghost, len 44 00.341981 Howard > Ghost, len 60 ``` ``` ENLISTED PLAYERS ------------------------------------------------------------------- ______________________________________________________ |Game Start in -:--____________________________________| __________________________ __________________________ / Ghost \/ Howard ___ ___ \ | _______ || _______ (o1:) ( ? ) | | />__X__<\ || />__X__<\ V_V V_V | | ___ |_oo_:_oo_| || |_oo_:_oo_| ( ? ) ( ? ) | | (ox:) |\______________V_V____V_V__/ | V-V New 001 |/ ___ ___ \ | ___ || _______ ( ? ) ( ? ) | | ( ? ) || />__?__<\ V_V V_V | | V-V || |_oo_:_oo_| ( ? ) ( ? ) | | ___ |\______________V_V____V_V__/ | ( ? ) |/ ___ ___ \ | V-V || _______ ( ? ) ( ? ) | | ___ || />__?__<\ V_V V_V | | ( ? ) || |_oo_:_oo_| ( ? ) ( ? ) | \__V_V_____________________/\______________V_V____V_V__/ ------------------------------------------------------------------- X = DELAY GAME B = BACK A = START GAME ``` While we are connected to the pre-game lobby (and not pressing anything), a 316-byte packet with a 44-byte response is sent by the guest every second. The host also sends a 60-byte packet with a 44-byte response every 5 seconds: ```text 00.000000 Ghost > Howard, len 316 00.064942 Howard > Ghost, len 44 00.426135 Howard > Ghost, len 60 00.599351 Ghost > Howard, len 44 01.040111 Ghost > Howard, len 316 01.068941 Howard > Ghost, len 44 02.080000 Ghost > Howard, len 316 02.274953 Howard > Ghost, len 44 03.119122 Ghost > Howard, len 316 03.278949 Howard > Ghost, len 44 04.159019 Ghost > Howard, len 316 04.283958 Howard > Ghost, len 44 05.199096 Ghost > Howard, len 316 05.287955 Howard > Ghost, len 44 05.428169 Howard > Ghost, len 60 ``` When starting the game we get an initial exchange of packets: ```text 00.000000 Howard > Ghost, len 60 00.022185 Ghost > Howard, len 44 00.263800 Ghost > Howard, len 316 00.265740 Howard > Ghost, len 44 00.655155 Howard > Ghost, len 1132 00.655155 Howard > Ghost, len 60 00.669542 Ghost > Howard, len 60 00.868764 Howard > Ghost, len 44 01.067857 Howard > Ghost, len 132 01.108929 Howard > Ghost, len 132 01.144205 Ghost > Howard, len 52 01.149912 Howard > Ghost, len 132 01.185236 Ghost > Howard, len 84 01.225189 Ghost > Howard, len 44 01.225248 Ghost > Howard, len 84 (x 13) 01.759222 Ghost > Howard, len 84 01.785473 Howard > Ghost, len 132 (x 14) 01.786040 Howard > Ghost, len 132 ``` ``` ________ ____ _____________ /008_x_A//@_4/ V\__________ \ >====> >====> +\_\_\_\_\_\__\ _,,- ,-'' | _,, =o=< |,,-''___ __ __ \|' / /____ / \,--'''--,/ \ /\ ____| |_____ | |__ __| | ___ ____----- / /______ / ,, ''' ,, \ / = \ / /| | |_______ / | | | | \ |--=--| | <||||| | |________ '-|__|_________|__|-' \_=_/ \___ \| | |________ ---_____| |_______ ,---,__ _\____\______ __/ ,--,__'-\___---,_____ ,-' ''-, '--| ---,__'-- _/ /\\_| \ '-- _,----' _-'\\/ |_test_ ___,----' \/ | | '--,__ ,-' ,-' |_| ``` After that we get repeating cycles of packets again: ```text 00.000000 Ghost > Howard, len 84 00.001678 Howard > Ghost, len 132 00.009963 Ghost > Howard, len 44 00.041000 Ghost > Howard, len 84 00.042677 Howard > Ghost, len 132 00.082002 Ghost > Howard, len 84 00.083680 Howard > Ghost, len 132 00.123003 Ghost > Howard, len 84 00.124683 Howard > Ghost, len 132 00.163999 Ghost > Howard, len 84 00.165695 Howard > Ghost, len 132 00.205000 Ghost > Howard, len 84 00.206679 Howard > Ghost, len 132 00.210963 Ghost > Howard, len 44 00.246005 Ghost > Howard, len 84 00.247677 Howard > Ghost, len 132 00.247715 Howard > Ghost, len 132 ``` The most frequent packet is 84 bytes, it is sent by the guest every 41 ms and receives a 132-byte response. This is close to the 25 frames per second that PAL Halo runs at. Sometimes, another 132-byte packet is sent by the host immediately after the first response. Every 200 ms a 44-byte packet is also sent by the guest with no response. So, some form of game data is sent between the guest and the host every frame of the game. Now we just need to inspect those packets and see if we can understand what information they contain, right? We can supply the `-X` flag to `tcpdump` in order to show the packet data in hexadecimal and ASCII. At first we listen during the game search as those packets should be mostly static and perhaps contain the name of the host in plain text. Let us try it out: ```cmdline $ tcpdump -e -ttttt -X "udp port xbox" 00.000000 Ghost > ff:ff:ff:ff:ff:ff, len 60 0x0000: 4500 0058 13d1 0000 4011 66c4 0000 0001 E..X....@.f..... 0x0010: ffff ffff 0c02 0c02 0044 2ed5 ffff ffff .........D...... 0x0020: ffff ffff 8246 0228 a44a 9087 dada fddd .....F.(.J...... 0x0030: 8513 eeab 4048 fe18 d09a ec97 d161 ea37 ....@H.......a.7 0x0040: 7459 d31a 8633 ecc0 2cbc f43f 1635 4c17 tY...3..,..?.5L. 0x0050: 1466 f586 e72a ccdc .f...*.. 00.005401 Howard > ff:ff:ff:ff:ff:ff, len 324 0x0000: 4500 0160 c9fb 0000 4011 af91 0000 0001 E..`....@....... 0x0010: ffff ffff 0c02 0c02 014c 888f ffff ffff .........L...... 0x0020: ffff ffff a64c 9b44 9a6e 8a15 537d 10e2 .....L.D.n..S}.. 0x0030: ccd0 8f84 8d12 f572 41c1 4d05 db1e b765 .......rA.M....e 0x0040: 8f47 fb39 6f32 c357 7ea6 a551 2c21 f632 .G.9o2.W~..Q,!.2 : 0x0150: 18a7 0f7c f970 b787 b371 6c60 1260 5825 ...|.p...ql`.`X% 02.040201 Ghost > ff:ff:ff:ff:ff:ff, len 60 0x0000: 4500 0058 14d1 0000 4011 65c4 0000 0001 E..X....@.e..... 0x0010: ffff ffff 0c02 0c02 0044 8e15 ffff ffff .........D...... 0x0020: ffff ffff 64eb c15c 4dd1 58da 4160 2309 ....d..\M.X.A`#. 0x0030: c9a9 0425 87ef dcd7 48a0 cf1e 826f 71de ...%....H....oq. 0x0040: d626 8560 ec82 4199 cb8f 7ae3 fe4e 8a4d .&.`..A...z..N.M 0x0050: 5f8a 80c1 2fa9 80a3 _.../... 02.055399 Howard > ff:ff:ff:ff:ff:ff, len 324 0x0000: 4500 0160 cefb 0000 4011 aa91 0000 0001 E..`....@....... 0x0010: ffff ffff 0c02 0c02 014c 7062 ffff ffff .........Lpb.... 0x0020: ffff ffff 7e37 59fa e1f3 0c03 618f f89c ....~7Y.....a... 0x0030: 9fb7 8a3d 7edf 4d1c c042 cb39 5c0a c8e0 ...=~.M..B.9\... 0x0040: e22d d069 1ed6 4cfc 555f 6412 ce2b e2f1 .-.i..L.U_d..+.. : 0x0150: b0df bc69 a6cc 6e82 99f9 e032 82cb 19e0 ...i..n....2.... ``` Unfortunately, these packets do not contain any readable strings, nor is the content static between packets. The data shown by tcpdump shows the whole packet except for the Ethernet header, we can annotate one of the packets as per the [IP][iphdr] and [UDP][udphdr] RFCs: ```text /* IPv4 header */ 00: 45 /* version = ipv4, header length = 5*4 = 0x14 bytes */ 01: 00 /* type of service = routine */ 02: 00 58 /* IP total length */ 04: 13 d1 /* identification */ 06: 00 00 /* flags, fragment offset */ 08: 40 /* time to live */ 09: 11 /* IP protocol = UDP */ 0a: 66 c4 /* IP header checksum */ 0c: 00 00 00 01 /* IP source address */ 10: ff ff ff ff /* IP destination address */ /* UDP header */ 14,00: 0c 02 /* UDP source port = 3074 */ 16,02: 0c 02 /* UDP destination port = 3074 */ 18,04: 00 44 /* UDP total length */ 1a,06: 2e d5 /* checksum */ /* data */ 1c,08,00: ff ff ff ff ff ff ff ff 82 46 02 28 a4 4a 90 87 2c,18,10: da da fd dd 85 13 ee ab 40 48 fe 18 d0 9a ec 97 3c,28,20: d1 61 ea 37 74 59 d3 1a 86 33 ec c0 2c bc f4 3f 4c,38,30: 16 35 4c 17 14 66 f5 86 e7 2a cc dc ``` The actual data always starts with eight `0xff` bytes for these packets, the remaining looks completely random, presumably encrypted. How come the packets are encrypted? Is the System Link traffic over LAN really that critical? Oh well, now I am even more curious to see what it contains. [iphdr]: https://www.rfc-editor.org/rfc/rfc791#section-3.1 [udphdr]: https://www.rfc-editor.org/rfc/rfc768 ## Watching from inside Even though we are not able to read the contents of the packets from outside, the emulator must be able to read it as it has clearly managed to connect to and play with the Xbox console. If we were able to look inside the emulator's memory, we should be able to inspect the decrypted packet data. Like with QEMU, we can with xemu easily attach a debugger to the guest program. We simply provide `-gdb tcp::1234` when starting xemu to create a [GDB][] server that listens on the TCP port 1234. We can then connect to it by running ```gdbprompt (gdb) target remote :1234 ``` in a GDB console. [GDB]: https://en.wikipedia.org/wiki/GNU_Debugger However, it is not entirely trivial to find the location of a message in memory just as it has been decrypted. But if we were to know where the decryption occurs in the program, we could place a [breakpoint][] there and read the memory for the message right after it has been decrypted (or before it has been encrypted). [breakpoint]: https://en.wikipedia.org/wiki/Breakpoint I suspect that the cryptography stack is not specific to the Halo game, probably all Xbox multiplayer games use similar type of encryption. It would then seem plausible that the crypto code is located in a shared place, such as for example the Xbox kernel. The [XboxDevWiki][xboxdev] has a useful [wiki page][xboxkrnl] about the kernel, it has a list of functions and variables that it exports. Some of these exports contain nice words like "crypt" and "key". There is a group of exports prefixed with "Xc" that are related to cryptography and some sort of LAN key: [xboxdev]: https://xboxdevwiki.net/Main_Page | export | ordinal | type | |------------------------|---------|----------| | XcSHAInit | 335 | stdcall | | XcSHAUpdate | 336 | stdcall | | XcSHAFinal | 337 | stdcall | | XcRC4Key | 338 | stdcall | | XcRC4Crypt | 339 | stdcall | | XcHMAC | 340 | stdcall | | XcModExp | 345 | stdcall | | XcDESKeyParity | 346 | stdcall | | XcKeyTable | 347 | stdcall | | XcBlockCrypt | 348 | stdcall | | XcBlockCryptCBC | 349 | stdcall | | XboxLANKey | 353 | variable | Searching for these on the web leads to a [header file][xboxkrnl.h] in the [nxdk][] project, an open source SDK for the Xbox. It lists the prototype declarations of all exported kernel functions, including the `XcBlockCryptCBC` function which looks promising: ```c XBAPI VOID NTAPI XcBlockCryptCBC ( IN ULONG dwCipher, IN ULONG dwInputLength, OUT PUCHAR pbOutput, IN PUCHAR pbInput, IN PUCHAR pbKeyTable, IN ULONG dwOp, IN PUCHAR pbFeedback ); ``` [xboxkrnl.h]: https://github.com/XboxDev/nxdk/blob/master/lib/xboxkrnl/xboxkrnl.h [nxdk]: https://github.com/XboxDev/nxdk As mentioned by the table of exports, the functions use the [_stdcall_][] calling convention where all arguments are pushed to the stack in _right-to-left_ order. If we can place a breakpoint when entering the `XcBlockCryptCBC` function, we can read the input and output arguments from the stack in order to inspect both the encrypted and decrypted messages. Unfortunately we do not have a symbol table so we do not know where this function is located in the program memory. Actually, we do not even have the executable, the kernel is compressed and encrypted within the BIOS ROM. [stdcall]: https://en.wikipedia.org/wiki/X86_calling_conventions#stdcall Again, the emulator must be able to read it. The [kernel wiki page][xboxkrnl] mentions that the kernel base address is `0x8001_0000`. It might then be ready and loaded at that address already. If so, we can dump that region to a file using GDB. We need an end address so we will go ahead with, say 1 MiB from the base address: ```gdbprompt (gdb) dump binary memory xboxkrnl.exe 0x80010000 0x80110000 Cannot access memory at address 0x8003d000 ``` Oh, okay, I guess that is where it ends then, making it 180 KiB[^gdb-mmap]. [^gdb-mmap]: Is it possible to simply list the currently memory mapped regions of a remote GDB target instead of determining them with trial and error like this? E.g. like `info proc mem` for a native process. ```gdbprompt (gdb) dump binary memory xboxkrnl.exe 0x80010000 0x8003d000 ``` We can now try to use e.g. `objdump` to see if it can recognize the executable type and is able to find the exports: ```cmdline $ objdump -x xboxkrnl.exe xboxkrnl.exe: file format coff-i386 : Magic 010b (PE32) : Export Table: DLL name: xboxkrnl.exe Ordinal base: 1 Ordinal RVA Name 1 0x21281 : 335 0x20bcd /* XcSHAInit */ 336 0x20bd3 /* XcSHAUpdate */ 337 0x20bd9 /* XcSHAFinal */ 338 0x20bdf /* XcRC4Key */ 339 0x20be5 /* XcRC4Crypt */ 340 0x20beb /* XcHMAC */ : 345 0x20c25 /* XcModExp */ 346 0x20c41 /* XcDESKeyParity */ 347 0x20c47 /* XcKeyTable */ 348 0x20c4d /* XcBlockCrypt */ 349 0x20c69 /* XcBlockCryptCBC */ : 353 0x2bf08 /* XboxLANKey */ : 366 0x6193 ``` It can indeed! The file format is detected as [PE32][], which is an executable format also used by Windows. The kernel has 364 exports, from ordinal 1 to 366 but with 7 and 11 missing. The `XcBlockCryptCBC` function has ordinal 349 so it should be located at `0x8001_0000` + `0x0002_0c69` = `0x8003_0c69` in memory. We can place a breakpoint at this address while the host has created a game and the guest is searching: ```gdbprompt (gdb) break *0x80030c69 Breakpoint 1 at 0x80030c69 (gdb) continue Continuing. Breakpoint 1, 0x80030c69 in ?? () ``` [PE32]: https://en.wikipedia.org/wiki/Portable_Executable Examining the stack we can inspect the arguments: ```gdbprompt (gdb) x/8xw $sp 0xd00f38cc: 0x001b3fb4 0x00000001 0x00000020 0xd00086a8 0xd00f38dc: 0xd00086a8 0xd00f38f0 0x00000001 0xd00f3a70 ``` The only value that can reasonably be the input length is the third (`0x20`), so I assume that the first value is the return address pushed by the x86 [`call`][] instruction and the arguments start from the second word, giving us the following set of values for the arguments: ```c dwCipher = 1 dwInputLength = 0x20 /* = 32 */ pbOutput = 0xd00086a8 /* below the stack */ pbInput = 0xd00086a8 /* below the stack */ pbKeyTable = 0xd00f38f0 /* on the stack */ dwOp = 1 pbFeedback = 0xd00f3a70 /* on the stack */ ``` [call]: https://www.felixcloutier.com/x86/call Since the length is only 32 bytes, this might be the initial broadcast message from the guest and it is about to encrypt it. We can dump it to a file and read it with hexdump directly from GDB: ```gdbprompt (gdb) dump binary memory out 0xd00086a8 0xd00086a8+0x20 (gdb) !hexdump -C out 00000000 14 21 14 20 00 18 00 00 01 0c 01 14 21 00 01 1f |.!. ........!...| 00000010 36 cb 38 ae 61 da a6 00 01 02 03 04 05 06 06 11 |6.8.a...........| 00000020 ``` It contains no plain text data but it is clearly not encrypted or random data. Let us continue and see if we can find the response: ```gdbprompt (gdb) continue Continuing. Breakpoint 1, 0x80030c69 in ?? () (gdb) x/8xw $sp 0x8003a3e8: 0x001b3fb4 0x00000001 0x00000130 0x8200c83c 0x8003a3f8: 0x8200c83c 0x8003a40c 0x00000000 0x8003a58c ``` ```c dwCipher = 1 dwInputLength = 0x130 = 304 pbOutput = 0x8200c83c /* kernel memory? */ pbInput = 0x8200c83c /* same as output */ pbKeyTable = 0x8003a40c /* within loaded kernel image */ dwOp = 0 pbFeedback = 0x8003a58c /* within loaded kernel image */ ``` Now, the `dwinputLength` is 304 bytes which means that this may be the response from the host. The `dwOp` argument is set to 0 which might specify decryption while 1 specifies encryption. We can run until we reach the return address and dump the buffer after it has hopefully been decrypted: ```gdbprompt (gdb) until *0x001b3fb4 0x001b3fb4 in ?? () (gdb) dump binary memory out 0x8200c83c 0x8200c83c+0x130 (gdb) !hexdump -C out 00000000 14 20 14 21 01 20 00 00 |11 8c|01|1f 36 cb 38 ae |. .!. ......6.8.| 00000010 61 da a6|53 0c 32 bb ba 20 e8 58|6b d2 9c c4 5a |a..S.2.. .Xk...Z| 00000020 78 b6 2f|3a 85 08 da 03 5a cb 8e 31 b3 ac 57 49 |x./:....Z..1..WI| 00000030 9c d9 35|0c 00|00 12 5a 23 23 83|00 00 00 00 20 |..5....Z##..... | 00000040 14 01 00 00 00 48|00 6f 00 77 00 61 00 72 00 64 |.....H.o.w.a.r.d| 00000050 00 00 00 00 00 00|00 00 00 00 00 00 00 00 00 00 |................| * 00000080 00 00 00 6c 65 76 65 6c 73 5c 74 65 73 74 5c 62 |...levels\test\b| 00000090 6c 6f 6f 64 67 75 6c 63 68 5c 62 6c 6f 6f 64 67 |loodgulch\bloodg| 000000a0 75 6c 63 68 00 00 00 00 00 00 00 00 00 00 00 00 |ulch............| 000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00000100 00 00 00 02 00 01 00 01 00 10 00 19 00 02 00 3e |...............>| 00000110 21 3e 5e 61 67 65 20 69 6e 20 61 20 62 6f 74 02 |!>^age in a bot.| 00000120 01 02 03 04 05 06 06 11 96 d7 91 dc 81 8e 92 6e |...............n| 00000130 ``` Oh, now we have some plain text data! The name of my Xbox "Howard" is there (seemingly, in [UCS-2][] encoding). The message also seems to contain the file path to the selected map and some odd "age in a bot" string? [UCS-2]: https://en.wikipedia.org/wiki/UCS-2 ## Decrypting the traffic Now we could potentially just go ahead and try to analyze the messages using GDB, but it is quite awkward and it also interferes with the gameplay as it adds some delay. We will therefore do some more investigating to see if we can find a way to decrypt the traffic from a separate program without affecting the execution of the emulator. We know that the encryption and decryption happens in the `XcBlockCryptCBC` function. We are not entirely sure what kind of cipher is used, but from the name we can guess that it is a [block cipher][block-cipher] with [cipher block chaining][cbc] (CBC). The `Xc` kernel module also has a function by the name `XcDESKeyParity`, so perhaps the [DES][] cipher with CBC is used. The `keyTable` argument might be the subkeys used to decrypt each block. DES uses an 8-byte key (56 key bits, 8 parity bits) and an [initialization vector][] (IV) that matches the block size (8 bytes). There is also a successor to DES called [TripleDES][] that instead takes three DES keys, i.e. a 24-byte key. [block-cipher]: https://en.wikipedia.org/wiki/Block_cipher [cbc]: https://en.wikipedia.org/wiki/Cipher_block_chaining [DES]: https://en.wikipedia.org/wiki/Data_encryption_standard [initialization vector]: https://en.wikipedia.org/wiki/Initialization_vector [TripleDES]: https://en.wikipedia.org/wiki/Triple_des We can see if we can find some higher level function that calls the `XcBlockCryptCBC` function. We can break at the entry of `XcBlockCryptCBC` again and look at the stack: ```gdbprompt (gdb) break *0x80030c69 Breakpoint 1 at 0x80030c69 (gdb) continue Continuing. Breakpoint 1, 0x80030c69 in ?? () (gdb) x /8wx $sp 0xd00f38cc: 0x001b3fb4 0x00000001 0x00000020 0xd00085a8 0xd00f38dc: 0xd00085a8 0xd00f38f0 0x00000001 0xd00f3a70 ``` Again, we have a return address as the last element in the stack, and it is the same as last time: `0x001b3fb4`. The function was presumably entered by using the `call` instruction which is 5 bytes long, so we can see if it lies just above the return address: ```gdbprompt (gdb) x /5i 0x001b3fb4-5 0x1b3faf: call 0x19346 0x1b3fb4: pop %esi 0x1b3fb5: leave 0x1b3fb6: ret $0x18 0x1b3fb9: movzbl 0xe(%ecx),%edx ``` It does, and it is calling a function located at `0x19345`. That function is actually located in the game binary, rather than in the kernel. But at `0x19345` is a jump to the kernel. The function that we see the instructions for here ends right after calling `XcBlockCryptCBC`. We can check what the whole function looks like: ```gdbprompt (gdb) (gdb) x /31i 0x001b3fb4-80 0x1b3f64: push %ebp 0x1b3f65: mov %esp,%ebp 0x1b3f67: sub $0x188,%esp 0x1b3f6d: xor %eax,%eax 0x1b3f6f: cmpl $0x8,0x10(%ebp) ; arg3_10h == 8? 0x1b3f73: push %esi 0x1b3f74: push 0xc(%ebp) ; pbKey <- arg2_ch 0x1b3f77: setne %al 0x1b3f7a: mov %eax,%esi 0x1b3f7c: lea -0x188(%ebp),%eax 0x1b3f82: push %eax ; pbKeyTable 0x1b3f83: push %esi ; dwCipher 0x1b3f84: call 0x1934c ; XcKeyTable 0x1b3f89: mov 0x14(%ebp),%eax ; arg4_14h 0x1b3f8c: mov (%eax),%ecx 0x1b3f8e: mov 0x4(%eax),%eax 0x1b3f91: mov %eax,-0x4(%ebp) 0x1b3f94: lea -0x8(%ebp),%eax 0x1b3f97: push %eax ; pbFeedback <- arg4_14h 0x1b3f98: push 0x8(%ebp) ; dwOp <- arg1_8h 0x1b3f9b: lea -0x188(%ebp),%eax 0x1b3fa1: push %eax ; pbKeyTable 0x1b3fa2: push 0x18(%ebp) ; pbInput <- arg5_18h 0x1b3fa5: mov %ecx,-0x8(%ebp) 0x1b3fa8: push 0x18(%ebp) ; pbOutput <- arg5_18h 0x1b3fab: push 0x1c(%ebp) ; dwInputLength <- arg6_1ch 0x1b3fae: push %esi ; dwCipher 0x1b3faf: call 0x19346 ; XcBlockCryptCBC 0x1b3fb4: pop %esi 0x1b3fb5: leave 0x1b3fb6: ret $0x18 ``` This function uses 6 arguments. From the call to `XcBlockCryptCBC` we can immediately tell that - argument 5 is an in-place buffer used for both input and output, - argument 6 is the length of the buffer data, - argument 1 determines if it is an encryption (=1) or decryption (=0). The earlier call is to `XcKeyTable` which might generate the subkeys from the input key, i.e. key schedule. We can also check its signature from the `xboxkrnl.h` header. It takes argument 2 for its `pbKey` argument which means that - argument 2 is the encryption/decryption key. Let us try to place a breakpoint at the function entry and look at the stack to see what arguments it receives the next time we enter it: ```gdbprompt (gdb) break *0x1b3f64 Breakpoint 2 at 0x1b3f64 (gdb) continue Continuing. Breakpoint 2, 0x001b3f64 in ?? () (gdb) x/7wx $sp 0x8003a598: 0x001b5400 0x00000000 0xd0042cac 0x00000018 0x8003a5a8: 0x8200d034 0x8200d03c 0x00000130 ``` The third argument seems to be a length, it is currently `0x18` = 24. Interestingly, if the third argument is 8, `dwCipher` is set to 0, otherwise to 1. This might mean that - argument 3 is the key length, and `dwCipher=0` specifies the DES cipher while `dwCipher=1` specifies the TripleDES cipher. That leaves us with only argument 4 remaining, which is used as the `pwFeedback` argument to `XcBlockCryptCBC`. As mentioned earlier, DES also takes an IV as input, so - argument 4 might be the IV. We can try to write the values for each argument and check the buffer before and after the function has executed using a GDB command: ```gdb define cbc_crypt set $encrypt = *(int*)($sp+0x4) set $key = *(char**)($sp+0x8) set $key_len = *(int*)($sp+0xc) set $iv = *(char**)($sp+0x10) set $buf = *(char**)($sp+0x14) set $buf_len = *(int*)($sp+0x18) printf "key @ %p+%d:\n", $key, $key_len dump binary memory key $key $key+$key_len !hexdump -vC key printf "iv @ %p+%d:\n", $iv, 8 dump binary memory iv $iv $iv+8 !hexdump -vC iv printf "input @ %p+%d:\n", $buf, $buf_len dump binary memory input $buf $buf+$buf_len !hexdump -vC input # run until ret until *$sp if $encrypt == 1 printf "encrypted:\n" else printf "decrypted:\n" end dump binary memory out $buf $buf+$buf_len !hexdump -vC out end ``` Running it yields: ```gdbprompt (gdb) cbc_crypt key @ 0xd0042cac+24: 00000000 c2 b0 31 10 79 2a 97 e5 91 bf ec 52 ce 80 fe f7 |..1.y*.....R....| 00000010 ad 19 80 c2 89 38 01 62 |.....8.b| 00000018 iv @ 0x8200d034+8: 00000000 57 9a 33 61 d1 66 7d aa |W.3a.f}.| 00000008 input @ 0x8200d03c+304: 00000000 bc 01 19 e6 1f cf 0f 49 72 f8 10 6a 48 b0 09 b3 |.......Ir..jH...| 00000010 d8 06 c7 f7 77 82 ed 03 91 82 8c 05 54 4b 74 26 |....w.......TKt&| 00000020 45 95 1e 45 fd 73 65 80 70 fd 9d f2 5a 63 6f ea |E..E.se.p...Zco.| 00000030 b5 ab d2 0f a7 05 40 34 9d a0 31 86 09 34 b5 2b |......@4..1..4.+| 00000040 74 eb 3b b6 7e 3f cf 2c b8 1d e7 54 86 56 90 2a |t.;.~?.,...T.V.*| 00000050 79 eb 88 91 5a 85 eb 84 c9 45 bf 5b cb c4 f1 5a |y...Z....E.[...Z| 00000060 9c ff 1f 12 3d 78 c6 a8 ee 8c dc da c8 4e 59 19 |....=x.......NY.| 00000070 53 3b ba 3e 8f 10 32 95 d0 ab 4f f7 10 64 fe 3f |S;.>..2...O..d.?| 00000080 26 92 a8 c8 cd 66 8f cc ba d0 9e 22 69 c7 dc 4b |&....f....."i..K| 00000090 c6 ec 2a d3 80 50 38 e5 1e 08 76 21 87 78 56 b1 |..*..P8...v!.xV.| 000000a0 7b 85 a6 c4 2a 0e e8 a5 5b 29 c3 39 58 84 a9 bd |{...*...[).9X...| 000000b0 06 67 fe c4 fa e4 7b 03 63 ca 9c 5d 55 6d 15 10 |.g....{.c..]Um..| 000000c0 3c bf 25 c7 b9 4c 7a d2 7c 04 05 78 fd 3c 91 c7 |<.%..Lz.|..x.<..| 000000d0 18 67 a7 0e 3d d6 a9 aa 31 68 d7 bb 10 eb 95 c7 |.g..=...1h......| 000000e0 ca 0e e0 fa 1e 74 2f 5a 68 a4 45 3a 7a 7c 99 da |.....t/Zh.E:z|..| 000000f0 79 0e 30 2a 36 13 a4 fe a3 87 a6 78 7c e6 70 5f |y.0*6......x|.p_| 00000100 8e c1 96 d7 7e 5f c9 8f 56 8d 1b 2d ed 87 75 f6 |....~_..V..-..u.| 00000110 4d 18 5b 64 03 6e e9 be 9b 2a 67 aa 04 c8 77 69 |M.[d.n...*g...wi| 00000120 08 a5 e3 af fe f4 9b b8 0c 70 77 2b 00 79 dc a5 |.........pw+.y..| 00000130 0x001b3fb6 in ?? () decrypted: 00000000 14 20 14 21 01 20 00 00 11 8c 01 40 b6 43 66 a7 |. .!. .....@.Cf.| 00000010 b5 22 db ed 67 79 42 41 96 0c bd 64 96 aa 34 72 |."..gyBA...d..4r| 00000020 ca 5e 97 3d e7 7a e4 79 35 18 53 b3 59 8d 66 5c |.^.=.z.y5.S.Y.f\| 00000030 c7 ed 84|0c 00|00 12 5a 23 23 83|00 00 00 00 20 |.......Z##..... | 00000040 14 01 00 00|00 48 00 6f 00 77 00 61 00 72 00 64 |.....H.o.w.a.r.d| 00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 00000060 00 00 00 00|00 00 00 00 00 00 00 00 00 00 00 00 |................| 00000070 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 00000080 00 00 00|6c 65 76 65 6c 73 5c 74 65 73 74 5c 62 |...levels\test\b| 00000090 6c 6f 6f 64 67 75 6c 63 68 5c 62 6c 6f 6f 64 67 |loodgulch\bloodg| 000000a0 75 6c 63 68 00 00 00 00 00 00 00 00 00 00 00 00 |ulch............| 000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 000000c0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 000000d0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 000000e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 000000f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| 00000100 00 00|00 02|00 01|00 01| 00 10|00 19|00 02|00|3e |...............>| 00000110 21 3e 5e|61 67 65 20 69 6e 20 61 20 62 6f 74|02 |!>^age in a bot.| 00000120 01 02 03 04 05 06 06 11 50 16 3e 60 ac ff 0d d0 |........P.>`....| 00000130 ``` We can use OpenSSL to verify if it is actually using TripleDES CBC. After some trial and error, I managed to find a set of arguments that successfully decrypts the message with the above key and IV: ```cmdline $ openssl des-ede3-cbc -d -nopad \ -K $(hexdump -ve '/1 "%02x"' key) \ -iv $(hexdump -ve '/1 "%02x"' iv) \ -in input \ | hexdump -C 00000000 14 20 14 21 01 20 00 00 11 8c 01 40 b6 43 66 a7 |. .!. .....@.Cf.| 00000010 b5 22 db ed 67 79 42 41 96 0c bd 64 96 aa 34 72 |."..gyBA...d..4r| 00000020 ca 5e 97 3d e7 7a e4 79 35 18 53 b3 59 8d 66 5c |.^.=.z.y5.S.Y.f\| 00000030 c7 ed 84 0c 00 00 12 5a 23 23 83 00 00 00 00 20 |.......Z##..... | 00000040 14 01 00 00 00 48 00 6f 00 77 00 61 00 72 00 64 |.....H.o.w.a.r.d| 00000050 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00000080 00 00 00 6c 65 76 65 6c 73 5c 74 65 73 74 5c 62 |...levels\test\b| 00000090 6c 6f 6f 64 67 75 6c 63 68 5c 62 6c 6f 6f 64 67 |loodgulch\bloodg| 000000a0 75 6c 63 68 00 00 00 00 00 00 00 00 00 00 00 00 |ulch............| 000000b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| * 00000100 00 00 00 02 00 01 00 01 00 10 00 19 00 02 00 3e |...............>| 00000110 21 3e 5e 61 67 65 20 69 6e 20 61 20 62 6f 74 02 |!>^age in a bot.| 00000120 01 02 03 04 05 06 06 11 50 16 3e 60 ac ff 0d d0 |........P.>`....| 00000130 ``` Now we can try to observe the arguments for the "cbc\_crypt" function for more packets, allowing us to make some observations: - The 24-byte key is always the same, it is [symmetric][sym-enc] and is used for all broadcast messages. - After joining a game, two new 8-byte symmetric keys are used, one for messages encrypted by the host and one for messages encrypted by the guest. The same pair of keys are used until the game is closed or the guest leaves. - The crypt function is not entered for all packets, some packets are sent without encryption. - The UDP payload of _encrypted_ packets looks similar to an [IPSec ESP][] in transport mode. The payload seems to contain - A 4-byte [Security Parameter Index (SPI)][SPI]. It is set to `0xffff_ffff` for broadcasts, otherwise: - 1 byte indicating whether the packet is encrypted (=1) or not (=0), - 2 bytes to identify the security association, - 1 byte set to zero or one (unknown), - a 4-byte sequence number, set to `0xffff_ffff` for broadcasts, incremented once for each packet after a security association is established, - an 8-byte IV, used to decrypt the message, - an encrypted payload, containing - an 8-byte UDP header OR a 20-byte TCP header, - a 2-byte value that is set to `(l << 4) | 0xc` where `l` is the length of the encrypted payload minus the transport layer header and padding. - 1 byte always set to one. - a variable-length message, - a variable number of incrementing padding bytes, - 1 byte specifying pad length, - 1 byte specifying IP protocol (UDP or TCP), - a 12-byte checksum or Integrity Check Value (ICV) of the packet. [sym-enc]: https://en.wikipedia.org/wiki/Symmetric-key_algorithm [SPI]: https://en.wikipedia.org/wiki/Security_Parameter_Index [IPSec ESP]: https://en.wikipedia.org/wiki/IPsec#Encapsulating_Security_Payload Typically, ESP is used directly on top of IP but, interestingly, they have chosen to use it on top of UDP and then also use UDP or TCP packets inside the encrypted packets. A [pre-shared key][psk] is used for broadcasts and first when joining, a type of [security association][sa] is established where the two instances agree on a pair of keys to use for the remaining packets. We can also confirm that the algorithm using 8-byte keys is DES, the below openssl command successfully decrypts messages with a given key and IV: ```cmdline $ openssl des-cbc -provider legacy -d -nopad \ -K $(hexdump -ve '/1 "%02x"' key) \ -iv $(hexdump -ve '/1 "%02x"' iv) \ -in input ``` [psk]: https://en.wikipedia.org/wiki/Pre-shared_key [sa]: https://en.wikipedia.org/wiki/Security_association ## Key derivation The whole encryption stuff got me _intrigued_ and I am now curious how it is set up. If we investigate further, we might be able to create a program that can read the packets without relying on an emulator or debugger to obtain the keys. In order to more easily understand what is going on, we will run two emulators that connect to each other, each with a GDB instance attached to it. We will also use tcpdump to record the packets to a file (.pcap). Each GDB instance will run a command that sets up breakpoints at points of interest, such as when entering a system call. Whenever a breakpoint is hit, it checks the PC to determine which one was triggered, and then dumps internal state such as inputs and outputs in a readable format. We can then see if any of the inputs or outputs from the different system calls match each other or parts of the packets and therefore follow the derivation of the keys. The workflow looks like this: 1. Start two game instances, each with a separate GDB server. 2. Run a [script][monitor.sh] that: - connects a GDB client to each GDB server and loads a GDB [command][xboxkrnl.gdb], - launches tcpdump to record packets to a file, 4. Do something interesting, e.g. searching for or joining a game, 5. Inspect the traces of internal state and packet contents, 6. Update the GDB command with breakpoints or watchpoints at new locations of interest, 7. Repeat from 2. [monitor.sh]: https://git.sr.ht/~nhellman/hllmn/tree/master/item/content/blog/2023-09-18_h1x-net/src/monitor.sh [xboxkrnl.gdb]: https://git.sr.ht/~nhellman/hllmn/tree/master/item/content/blog/2023-09-18_h1x-net/src/xboxkrnl.gdb After some manual investigations like this we have mapped out how each client derives the DES keys and where each piece of data comes from. The process for each client looks something like this: ``` [0] XboxLANKey | TitleLANKey | | | ___________________ __V_____V__V_ _________________ / XcRC4Crypt (PRNG) \ / XcHMAC \ |Broadcast message| \___________________/ \_____________/ |_________________| | | | _________________ base | d (constant) key | | |Handshake payload|--+ | exp | __V____V__ |_________________| V__V_____V_ mod / XcHMAC \ | | / XcModExp \ \__________/ [4] | | [2] \__________/ | | | | | | ___V____V__V__ _____V___V___V / XcHMAC \ / XcHMAC \ \______________/ \______________/ | | _______V______ ______V_______ /XcDESKeyParity\ /XcDESKeyParity\ \______________/ \______________/ | | ______V_____ ______V______ |Host DES key| |Guest DES key| |____________| |_____________| ``` The `XcHMAC` function is used extensively. It generates a typical [hash-based message authentication code][hmac] using [SHA-1][]. It takes a 16-byte key, a variable-length input and generates a 20-byte [message authentication code][mac]. We can replicate the implementation using e.g. OpenSSL's SHA-1 implementation. [hmac]: https://en.wikipedia.org/wiki/HMAC [SHA-1]: https://en.wikipedia.org/wiki/SHA-1 [mac]: https://en.wikipedia.org/wiki/Message_authentication_code By reimplementing the system calls, most of the things in the derivation can easily be replicated by a program of ours. Most inputs are just constants, and some are from the packets. There is, however, one concerning source of data here: a [pseudo-random number generator (PRNG)][prng]. How could our program possibly have a PRNG that generates the exact same number as the game instance? Also, how can two game instances derive the same key with two different random numbers as inputs? [prng]: https://en.wikipedia.org/wiki/Pseudorandom_number_generator Looking at the traces from the debugger, the two instances do indeed have completely different exponent inputs. However, they also have different base inputs, and they end up with the exact same output. The function in question is the `XcModExp` function, which performs [modular exponentiation][mod-exp], i.e. it calculates $`a = b^c \: (\mathrm{mod}\: d)`. In this case each value is a 768-bit integer (96 bytes) so brute forcing would take _a while_. The exponent is random and the base is from one of the messages, i.e. it comes from the other client. Looking further back in the trace we can see that the clients first called `XcModExp` to generate one number each[^expbase]: $$` \begin{align} a_0 = {2^88}^{c_0} \: (\mathrm{mod}\: d) \\ a_1 = {2^88}^{c_1} \: (\mathrm{mod}\: d) \end{align} ` where $`d` is a shared constant[^fodg] found in the game binary and $`c_0`, $`c_1` are randomly generated numbers. The instances then share $`a_0` and $`a_1` with each other in the handshake packet and then calculate $$`e = {a_1}^{c_0} \: (\mathrm{mod}\: d)` and $$`e = {a_0}^{c_1} \: (\mathrm{mod}\: d)` respectively. Both end up with the same value $`e`, without ever knowing the other instance's $`c`-value. This is known as the [Diffie-Hellman key exchange][diffie-hellman]. As a bystander, we only know $`a_0`, $`a_1` and $`d`, without neither $`c_0` or $`c_1` it is very difficult for us to determine $`e`[^dhp]. But for us to be able to derive the keys, we will have to _somehow_ determine $`e`. Hmm... If we were to act as a client, we could easily mimic what the game does and get another client to generate an $`e`-value together with us. If we can do that with one of the clients, we should be able to do it with two clients simultaneously. What if we could calculate an $`e`-value for two instances, and then act as an intermediary between them... What if we could act as a... [mod-exp]: https://en.wikipedia.org/wiki/Modular_exponentiation [diffie-hellman]: https://en.wikipedia.org/wiki/Diffie_hellman [^expbase]: The $`2^88` constant is represented with 96 bytes: 95 zeros followed by a two. I initially thought it was a big-endian 2 but since the `XcModExp` uses little-endian, the value will evaluated as $`2^88`. I am not sure if $`2^88` is a better value than 2 here and if it was even intentional to use that over simply 2. [^fodg]: The $`d` constant is the prime number $`2^768-2^704-1+2^64(2^638\pi+149686)`, referred to as the "First Oakley Default Group" by [RFC 2409][rfc2409]. Although, it it is written in big-endian while `XcModExp` seems to use little-endian. [rfc2409]: https://www.rfc-editor.org/rfc/rfc2409#section-6.1 [^dhp]: This is known as the [Diffie-Hellman problem][dhp]. [dhp]: https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_problem ## Man in the middle It seems like it might be (practically) impossible for a listener to obtain the keys used between two parties who have successfully established a security association. However, it should be possible for us to create two security associations, one with each party, and then simply shuffle the messages between them by decrypting and re-encrypting each packet. It's like playing chess with two other players at the same time and just relaying their moves, effectively making them play against each other. This can be considered a [man-in-the-middle (MITM) attack][mitm]. Performing such an attack would also enable us to alter the packets without the clients knowing. [mitm]: https://en.wikipedia.org/wiki/Man-in-the-middle_attack In order to perform the attack we will write a C program that reads and listens to a network socket. We will be using a raw socket which picks up all packets, regardless of the destination IP address[^ipcap]. When running Linux, we can achieve that by opening a socket with the `AF_PACKET` address family, the `SOCK_RAW` socket type and the `ETH_P_ALL` protocol as described by [`packet(7)`][]: ```c int sockfd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL)); if (sockfd == -1) { perror("socket failed"); return EXIT_FAILURE; } ``` [^ipcap]: The packets are kind of odd, they always use the same destination IP addresses for all recipients (`255.255.255.255` for broadcasts and `0.0.0.1` for unicasts). I tried to capture the packets by using UDP sockets that listen to the Xbox port. The broadcasts that were addressed to `255.255.255.255` were picked up after binding to `INADDR_ANY` (`0.0.0.0`) or `INADDR_BROADCAST` (`255.255.255.255`). However, the packets addressed to `0.0.0.1` were not, and I was also not able to bind the socket to `0.0.0.1` because that address is not considered _local_. I am not sure if it is possible to listen to addresses not assigned to my machine? Would I be able to listen to it if I managed to statically assign `0.0.0.1` to my machine? We are now able to read packets from all interfaces using [`recv(2)`][] on the socket: ```c uint8_t p[BUFSIZE]; ssize_t p_len = recv(sockfd, p, sizeof(p), 0); if (p_len == -1) return; ``` But in order to be able to send packets via the socket, we have to bind it to a specific interface. We can look up the index of a specific interface using [`ioctl(2)`][] and then bind it: ```c struct ifreq ifr; strcpy(ifr.ifr_name, net_interface); size_t ret_ifindex = ioctl(sockfd, SIOCGIFINDEX, &ifr); assert(ret_ifindex == 0); struct sockaddr_ll sll = { .sll_family = AF_PACKET, .sll_ifindex = ifr.ifr_ifindex, .sll_protocol = htons(ETH_P_ALL), }; if (bind(sockfd, (struct sockaddr *)&sll, sizeof(sll)) == -1) { perror("bind failed"); exit(EXIT_FAILURE); } ``` We can now send packets over the interface using [`send(2)`][]: ```c int send_len = send(sockfd, p, p_len, 0); if (send_len == -1) { perror("send failed"); exit(EXIT_FAILURE); } ``` [packet(7)]: https://www.man7.org/linux/man-pages/man7/packet.7.html [recv(2)]: https://www.man7.org/linux/man-pages/man2/recv.2.html [ioctl(2)]: https://www.man7.org/linux/man-pages/man2/ioctl.2.html [send(2)]: https://www.man7.org/linux/man-pages/man2/send.2.html As we saw previously, the searching guest client will send out a broadcast packet to which all hosts with an active game will reply to. If we just copy a broadcast message in full, and broadcast it ourselves, we receive replies from the hosts as well. However, if we decrypt it, modify any byte and re-encrypt it, we no longer receive a reply. Recall that there was an ICV, i.e. a checksum of the packet contents that was appended to the end of the packet. Can we alter the packet and simply recalculate the ICV? I think we can. Looking at our trace from GDB, after a packet has been encrypted and before it is sent over the network, a call to `XcHMAC` is performed with the packet contents (starting from the UDP payload) as the data and a constant value as the key. The first 12 bytes of the output then matches the 12 last bytes of the packet as seen by tcpdump. We can simply reuse that key and generate a new ICV after we have altered the packet. Now we can try to decrypt the packet, alter some of the bytes, re-encrypt it and append it with a correct ICV. Depending on what bytes we alter, we still get a reply. After some experimentation with altering various parts in the packets while inspecting the traces, the two broadcast packets seem to be structured like this[^unaligned-ingame]: ```c /* guest searching for games */ struct broadcast_g2h { struct udphdr udp; uint16_t packet_size; /* (0x10 << 4) | 0xc */ uint8_t _ah; /* 01 */ uint8_t udp_port_guest[0x2]; uint8_t _eh[0x2]; /* 00 01 */ uint8_t guest_id[0x8]; uint8_t packet_type; /* 00 */ }; /* host responding to search */ struct broadcast_h2g { struct udphdr udp; uint16_t packet_size; /* (0x118 << 4) | 0xc */ uint8_t _ah; /* 01 */ uint8_t guest_id[0x8]; uint8_t host_id[0x8]; uint8_t hmac_data0[0x8]; uint8_t hmac_data1[0x10]; uint8_t _33h[0x2]; /* 0c 00 */ uint8_t mac_src[ETH_ALEN]; uint8_t _3bh[0x9]; /* 00 00 00 00 20 14 01 00 00 */ char host[0x20]; uint8_t _58h[0x1f]; /* 00 .. */ char level_path[0x7f]; uint16_t game_rules; uint16_t _104h; /* 00 01 */ uint16_t number_of_players; uint16_t _108h; /* 00 10 */ uint16_t score_limit; uint16_t mode; uint8_t _10eh; /* 00 */ uint8_t game_status[0x4]; uint8_t age_in_a_bot[0xc]; uint8_t packet_type; /* 02 */ }; ``` [^unaligned-ingame]: Due to some fields not being aligned like a struct in C is [typically aligned][calign], we have to use `uint8_t` arrays in order to avoid padding. [calign]: https://en.wikipedia.org/wiki/Data_structure_alignment#Typical_alignment_of_C_structs_on_x86 Both packets start with a UDP header and end with a byte that specifies what type of packet it is. This byte at least seems to be constant and unique to each type of packet so we will use it as a packet type identifier, regardless of its intended meaning. At first, the guest initially sends a packet that specifies an ID for itself. The host then replies with the same guest ID, and another ID for itself. It also specifies some random data that is used in some HMAC calls on both sides to generate the keys, as seen in the derivation diagram above. The host also specifies its own MAC address so the guest knows where to send [unicast][] packets. Finally, the broadcast contains the game name and game options that we are able to see on the screen when selecting a game to join. [unicast]: https://en.wikipedia.org/wiki/Unicast Then, when the guest actually chooses to join a game, the guest client sends an unencrypted handshake unicast packet to the host MAC address to which the host immediately replies with a similar packet. The two handshake packets are structured like this: ```c struct handshake { uint8_t _0h[0xc]; /* 00 00 00 00 00 XX XX 00 00 00 00 00 */ uint8_t hmac_data1_guest[0x8]; /* randomized by guest */ uint8_t hmac_data1_host[0x8]; /* randomized by host */ uint8_t mod_exp_base[MOD_EXP_LEN]; uint8_t broadcast_hmac_data0[0x8]; uint8_t _7ch[0x2]; /* 0c 00 */ uint8_t mac_src[ETH_ALEN]; uint8_t _84h[0xb]; /* 00 00 00 00 XX XX XX e9 fc d8 d9 */ uint8_t packet_type; /* 01 */ }; ``` The guest sends some randomly generated HMAC data which is used during the key derivation, as seen in the above diagram. The `hmac_data1_host` field is just filled with zeros. The guest also includes its $`a_0` value that will be used as the base in the `XcModExp` call that we mentioned earlier. The guest also resends the HMAC data it received from the host's broadcast. It also includes its MAC address so the host can send unicast packets to it. The host then replies with a similar packet. But it writes its randomly generated HMAC data to the `hmac_data1_host` field and provides its $`a_1` value for `XcModExp`. Both clients now use the exchanged information to derive a pair of DES keys that will be used for all encrypted packets from here onwards. Similarly, they will derive a pair of keys used for the ICV for the remaining packets. So, let us try to intercept this process. We want to make the clients think that they are communicating with each other, while in reality they are both talking directly to us and we are just passing the messages along (with some small harmless alterations). Since no essential information is only present in the broadcast from the guest, we can just ignore that. But as soon as we see a broadcast reply from the host, we will immediately copy that and broadcast it with some alterations. We want the guest to send the handshake to us instead, so we will just overwrite the `mac_src` field that contains the host MAC address with the address of our own network interface. Now when we press join, the guest sends the handshake packet to our address instead and the host simply ignores it and does not respond with a handshake packet[^guest-overwrite]. The guest client then displays an error message because we did not respond. [^guest-overwrite]: Interestingly, the guest seems to take the latest packet it receives and just overwrite the earlier information it received rather than trusting the first packet and ignoring later packets. If we also choose to overwrite the host ID, we instead get two options two choose from in the game selection list. But by keeping the same host ID, we can silently act act as the original host. Next, we will simply take the handshake packet from the guest and resend it to the host (we know the host MAC address from the `mac_src` field in its broadcast that we intercepted). Because we do not know the secret $`c_0` value of the guest, we will also have to modify the $`a_0` value before we pass it to the host. Remember, the host will use it to calculate $$`e = {a_0}^{c_1} \: (\mathrm{mod}\: d).` We need to select the value of $`a_0` such that we can know what value of `e` the host will obtain. Any ideas? How about $`a_0 = 1`? Then we do not even need to implement the `XcModExp` function because we can perform the 768-bit integer calculation right here and now: $$`e = 1^{c_1} \: (\mathrm{mod}\: d) = 1.` Easy! So if we provide $`a_0=1` to the host, we know it will get $`e=1`, so we will also use that number for the key derivation. Next, we will receive a reply from the host with a handshake packet. We now have both the guest and host HMAC data so we are able to derive the DES and ICV keys to be used for the remaining packets. We can then overwrite the host's $`a_1` value in the handshake packet with 1 as well, replace the MAC address with ours and pass it along to the guest. After that we are in, when joining the game, we successfully enter the game through our MITM client! The attack can be summarized with the following sequence diagram: ``` Guest MITM Host | | | | | | | broadcast: id=g | |----------------------->|----------------------->O | | O | broadcast:|id=h,MAC=H O h.MAC=H |<-----------------------O<---------------------- O | O | | broadcast: id=h,MAC=M O | h.MAC=M O<-----------------------O | O | | O handshake: a0=X,MAC=G | | O----------------------->O | | O | | O handshake: a0,MAC=M | g.MAC=M | O----------------------->O e=1^c1=1 | | O | | handshake: a1=Y O | O<-----------------------O | O | | handshake: a1=1 O | e=1^c0=1 |<-----------------------O | | | | | | | ``` All remaining packets after the handshake can be passed through without any modifications. Each packet is sent to our client, decrypted and re-encrypted and then passed along to the other client using our derived keys. We no longer need an emulator with a debugger attached to read messages. We are now able to read and even alter packets sent between two completely unmodified consoles! ## The game network protocol The time has come, we are now finally able and willing to do what we set out to do in the first place, investigate the game network protocol! After some experimenting with pushing various buttons and trying different player counts while watching the packets in real-time, it seems like the clients more or less just send the player inputs to each other. The input for each player is contained in a 30-byte struct that looks like this: ```c struct player { uint32_t actions; float yaw; float pitch; float forward; float left; float fire_duration; int16_t selected_weapon; int16_t selected_grenade; int16_t zoom; }; ``` The `yaw` is set to the current direction that the player is facing, the `pitch` is the current pitch[^details-yaw-pitch]. `forward` is the ground acceleration in the forward direction, `left` for sideways ground acceleration with left as positive. The `fire_duration` starts counting from 0 when pressing fire and resets to zero when releasing, incrementing around 0.1 per second. [^details-yaw-pitch]: The yaw and pitch is in radians from 0 to $`2\pi` and $`\frac{-\pi}{2}` to $`\frac{\pi}{2}`, respectively. The yaw is 0 when facing east (assuming the assault rifle compass is pointing north). `selected_weapon` is the index of the currently equipped weapon. Similarly for `selected_grenade` but it is -1 when one has run out of grenades. `zoom` is -1 initially and becomes 0 for the first level of zoom and 1 for the second. These represent the current state and will not change if you e.g. try to switch grenades when you only have one type. The `actions` field contains bits for at least 9 actions: ```c enum action { CROUCH = 1 << 0x0, /* left stick */ JUMP = 1 << 0x1, /* A */ TOGGLE_FLASHLIGHT = 1 << 0x4, /* white */ RELOAD = 1 << 0x6, /* X */ MELEE = 1 << 0x7, /* B */ FIRE = 1 << 0xb, /* right trigger */ THROW_GRENADE0 = 1 << 0xc, /* left trigger */ THROW_GRENADE1 = 1 << 0xd, /* left trigger */ PICK_UP = 1 << 0xe, /* X hold */ }; ``` Each bit is high as long as its corresponding button is held down, although the `PICK_UP` only goes high after holding the "X" button for a short while. All of the actions will go high when you press their corresponding button regardless if the action can be performed or not. E.g. if you try to fire with no ammo, the `FIRE` action will still be active. These player structs are then placed in a continuous array, with one element for each player. The guest sends one struct for each player that is local to their console. The packet from the guest to the host is structured like this: ```c /* sent from each guest to the host once every tick */ struct ingame_g2h { struct udphdr udp; uint16_t packet_size; /* (0x09 + player_count * 0x1e) << 4 | 0xc */ uint8_t _10h; /* 01 */ uint8_t tick[0x4]; uint8_t player_count; struct player players[]; /* only players on guest console */ }; ``` The guest packet also contains the index of the current tick and some unknown data. It is not part of the struct definition here, but there is also a single byte after the player array that is always set to `0x19`[^flex-array]. [^flex-array]: We cannot represent the elements after the player array in the struct because the number of players varies, instead we use a C99 [flexible array member][c-flexarray] which just points to the start of the player array and ignore the byte after because it is constant anyway. [c-flexarray]: https://en.wikipedia.org/wiki/Flexible_array_member The host replies with not just the player structs for all remaining players, but *it also resends the player structs of the guest that it replies to*. The packet is structured like this: ```c /* sent from the host to every guest once every tick */ struct ingame_h2g { struct tcphdr tcp; uint16_t packet_size; /* ((0x11 + player_count * 0x1e) << 4) | 0xc */ uint8_t _16h; /* 01 */ uint8_t tick_a[0x4]; uint8_t hash[0x4]; /* XX XX XX XX */ uint8_t tick_b[0x4]; uint8_t player_count; struct player players[]; /* all players in game */ }; ``` The host packet contains two tick counters that are usually the same but sometimes go out of sync. They are typically 1 step ahead of the guest tick. The `hash` value is sometimes the same for multiple ticks, and changes more often when players are not idle. After the player array, there is one byte that is always set to `0x14`. The clients are more or less only sending inputs between each other, and are then relying on each other to simulate the game identically. It seems like there is a risk for the clients to go out of sync if any packets were to be dropped or delayed. We can notice that when e.g. momentarily minimizing the emulator, causing it to not reply in time, the game thereafter becomes unplayable and eventually disconnects. ## Altering in-game packets So what would happen if we altered any of the game information sent between the clients while playing? We can try to modify packets from the host at first, e.g. make a host player walk left when firing: ```c struct ingame_h2g *game = (struct ingame_h2g *)msg; struct player *p0 = (void *)game->players; if (ntohl(p0->actions) & FIRE) { p0->actions = htonl(ntohl(p0->actions) | CROUCH); } ``` [c-6-3-2-3-7]: https://rgambord.github.io/c99-doc/sections/6/3/2/3/index.html#p7 When the host and guest player are facing each other, and the host begins shooting, the guest sees the host shooting and walking left for a second before the screen changes to a full screen message: > ATTENTION > > Your Xbox console has gone out of sync with the other consoles in the game. > Resetting... > > Press START to Continue Then the guest attempts to "reset"[^desync-reset] but disconnects shortly thereafter. Interestingly, the clients did not just continue playing out of sync, the guest client quickly detected[^desync-how] that they were out of sync and supposedly tried to reset. [^desync-reset]: I am not sure how or if it is attempting to reset. In order to resynchronize, would the host not need to send its full game state? There are no additional packets being exchanged except for the normal ones. [^desync-how]: How does it detect the desync, though? Perhaps they are using the `hash` value that is sent by the host. The host might calculate a hash of its entire game state, if the guest client then obtains a different hash for its game state compared to the one it receives by the host, it will present the desync message and try to reset. Modifying this hash value once using our MITM client causes the desync message to flash by for a second on the guest console and make the guest players immediately leave the game. Unfortunately, this makes it difficult to create a dedicated server. If it were not for this desync detection, it might have been possible to create a simple server that simply broadcasts all player information to the guests. What would happen if we instead modify the information from the guest to the host? Remember, the host resends the player data that it receives to all guest clients, _including the guest that it received it from_. Will the host use the modified data and the guest use the original data, causing them to go out of sync like we saw previously? Or will the guest check that the data is unchanged and reject it immediately? Or will it just accept the received data and use that instead of the data that it sent? We can try it out: ```c struct ingame_g2h *game = (struct ingame_g2h *)msg; struct player *p0 = (void *)game->players; if (ntohl(p0->actions) & TOGGLE_FLASHLIGHT) { p0->forward = htonf(200 * ntohf(p0->forward)); p0->left = htonf(200 * ntohf(p0->left)); } ``` If a player from a guest client is holding the flashlight button, we will increase their speed before we send the packet off to the host. Let's see what happens when we run it and hold down the flashlight button as a guest player: We are able to run a lot faster than usual, even faster than a warthog jeep! I was really hoping for that to work and I am quite surprised it actually did! Heh, it is quite fun to fly around using the hills and valleys of Blood Gulch, sort of reminds of [Tribes][]. As you can see, it is very easy to kill oneself from fall damage, though. [Tribes]: https://en.wikipedia.org/wiki/Tribes_(video_game_series) This must mean that the guest client sends its controls to the host and assumes it gets back the same without actually checking it. The guest uses the response as the actual truth, completely ignoring what it actually sent. We are also lucky that the clients do not saturate the speed even though one can never run faster than 1.0 units. Some other things I tried with the client packets: - Changing the yaw or pitch causes the guest player camera to look one way and fire in another direction. - Using a vehicle with increased speed: the warthog and scorpion seem to have binary speed, either 0 or 1 and are not affected at all by a higher speed value. The acceleration of the ghost _does_ increase with a higher speed, but going way too fast it immediately leans forward and flips over, forcing you out of the vehicle. - Modifying the fire duration does not seem to have any effect. - Changing the selected grenade can prevent you from using grenades that you have but cannot let you use ones that you do not have. If you have a Linux machine and want to try it out yourself, the MITM client is available [here][mitm_h1.c]. Just compile it, make sure it is allowed to use raw sockets, and specify `pal`/`ntsc` and a network interface when running: ```cmdline $ cc -lcrypto mitm_h1.c -o mitm_h1 # setcap cap_net_raw=eip mitm_h1 $ ./mitm_h1 pal eth0 ``` You can also use `-DSNEAKY` when compiling to make it perform the attack silently, e.g. if you want to get the to the rocket launcher first every time at the next LAN party (people still have those, right?). Just be discrete with the laptop that is connected to the network switch and has console text quickly scrolling by. :> [mitm_h1.c]: https://git.sr.ht/~nhellman/hllmn/tree/master/item/content/blog/2023-09-18_h1x-net/src/mitm_h1.c ## Summary We tried to simply read the messages between two consoles playing Halo, but alas, we were immediately met by a stream of encrypted gibberish. By setting up a game instance in an emulator and attaching GDB to it we were able to peek at the contents of the secret messages that were sent between the instances. We also figured out that the game uses DES and TripleDES encryption and were able to decrypt the packets by peeking at the keys using GDB. Investigating further, we determined how the instances derive the keys and establish a security association during their initial handshake. By intervening in that process, we were able to perform a _man-in-the-middle_ attack, allowing us to read and _alter_ any packets that were sent between two consoles. Then, we finally took a look at the packets that were sent while playing. After figuring out their contents we were able to alter the packets from a guest client and implement a "speedhack" from a remote device for a game played by two _unmodified consoles_, using only a computer connected via a network switch. The authentication primitives are part of the Xbox kernel and are potentially common for all System Link games. The method demonstrated here should be possible to replicate in order to perform similar attacks for other Xbox games, potentially enabling other more interesting hacks. For example, would it not be cool if there was a game with a [stack buffer overflow][overflow] bug that allowed us to [run arbitrary code][ace] remotely? That could potentially enable us to [softmod][] an Xbox over the network, without the need of any special hardware[^softmod-hw]. This article has some discussion on [Hacker News][hn-comments]. [hn-comments]: https://news.ycombinator.com/item?id=37559499 [overflow]: https://en.wikipedia.org/wiki/Stack_buffer_overflow [ace]: https://en.wikipedia.org/wiki/Arbitrary_code_execution [^softmod-hw]: As far as I know, all current softmod methods still require special Xbox adapters in order to transfer the exploit program to the Xbox. [softmod]: https://en.wikipedia.org/wiki/Softmod