# Proone Software Design Spec

This document is part of **Proone Worm Project**. For overview, refer to [README.md](/README.md).

* TODO structure
* TODO workers and functions
* TODO CNC TXT REC
* TODO dvault
* TODO IPv6
* TODO classes

## Subsystems

### Heartbeat

**Heartbeat** is a subsystem of Proone that consists of a backdoor and CNC mechanism on infected devices. **The Heartbeat protocol** is a point-to-point or broadcast framing protocol that works over a transport stream such as TCP/IP. The protocol is documented separately in **[Protocol Spec](proto.md)**. An overview of the protocol follows.

The Heartbeat subsystem takes up a large portion of the Proone code base. The subsystem mainly works as a format in DNS TXT records and as a TCP/IP framing protocol. A complete heartbeat connection consists of an **authoritative end** and a **submissive end**.

In the Heartbeat protocol, a request is usually initiated by one end sending a single initiating frame, and the other end responds to it with one or more response frames. The protocol also employs the concept of "protocol upgrade", like that of WebSocket, in which both the request and the response can be streams of frames.

A request-response session is distinguished by a message id number. The message id number is generated by the end which initiated the session and is used for the duration of the session. The idea behind the message id number is to make the protocol pipelinable so that simple request-response pairs can be processed in parallel. This is merely a future-proof design and does not play a significant role.

Unlike conventional botnets, Proone instances (aka "bots") are controlled by DNS TXT records containing one or more request frames of an authoritative end. In this scheme, a request is initiated by Proone instances acting as a submissive end, querying and reading the contents of the TXT records. Any response data produced in the process is discarded.
The heartbeat protocol binary is represented in base64 encoding because most DNS management software does not accept binary data for the value of TXT records, although [the RFC spec](https://datatracker.ietf.org/doc/html/rfc1035#section-3.3) does not impose such a restriction.

Only public DNS servers which support DNS over TLS are used to counter lawful interception. This is because the DNS protocol is not encrypted, so ISPs or law enforcement can easily filter out TXT REC CNC traffic simply by doing a plain-text string search. A TLS library is already used to implement the SSH attack vector, so using the library for another purpose was an enticing choice.

Proone queries public DNS servers directly rather than using system functions. This eliminates the chance of ISP DNS servers returning false results. Using public DNS servers is also beneficial since law enforcement would have to take down the domain itself, as it would be difficult to convince the operators of public DNS servers to block a recursive query to a particular name server. Another benefit is not having to run CNC servers for simple tasks like running shell scripts.

There are two recommended applications. One typical application is having a `PRNE_HTBT_OP_HOVER`(Hand Over Operation) request frame in TXT records so that instances will connect to servers running authoritative htbt implementations for further instructions. The second application is having a `PRNE_HTBT_OP_RUN_CMD`(Run Command Operation) frame or a `PRNE_HTBT_OP_RUN_BIN`(Run Binary Operation) frame containing a simple minified shell script for instances to run.

Using CNC TXT records to transfer a large amount of data is possible but not recommended. In theory, doing `PRNE_HTBT_OP_NY_BIN`(Binary Upgrade Operation) with CNC TXT REC is possible. However, for Proone instances, querying TXT records, decoding base64 data and running a submissive heartbeat client is a costly operation. It's not a simple task and is prone to failure.

### Use Cases

To stop all Proone instances, issue command `kill -9 0` or `reboot -nf` with detach flag unset. To disable all hosts, issue command `half -nf`.

To do anything more complex, it's recommended to write an authoritative server implementation and command Proone instances to take orders from the servers running the implementation. Load balancing can be done at the DNS level using techniques like round-robin DNS or GeoDNS. Once a Proone instance connects to an authoritative server, the server can fully utilise the heartbeat protocol to do the tasks described below.

Shell scripts can be run on Proone hosts with `PRNE_HTBT_OP_RUN_BIN`(Run Binary Operation) as long as the script contains a shebang line at the very start of the script. Note that most embedded devices run lightweight shells like Ash(BusyBox) and Toysh(Toybox)[^1]. The best strategy is targeting the Bourne shell, which has historically been the default shell on the majority of systems.

To make hosts run an arbitrary binary executable, `PRNE_HTBT_OP_HOST_INFO`(Host Info Operation) can be used to query the architecture type of the host to select a suitable binary for upload.

To replace the Proone binary, `PRNE_HTBT_OP_NY_BIN`(Binary Upgrade Operation) can be used. The binary format for the operation is specified in a [separate document](proto.md). Upon successful upload, the Proone instance will attempt to `exec()` to the new binary after **binary recombination**(explained in a separate section) is performed. All this is done in the parent process. In the event of failure, Proone continues to operate with the existing binary. The only way to check the result of the operation is to reestablish the connection to the Proone instance and query the version of the binary through a `PRNE_HTBT_OP_HOST_INFO` request.

The protocol leaves room for implementing M2M mechanisms.
A Proone instance checks if the target host is already infected by attempting to connect to a **local back door**(or simply, **LBD**) on the target host. An LBD port is served by a submissive Heartbeat client. Future versions of Proone can utilise the LBD port to update the binary of the target instance if an old one is encountered. **proone-htbtclient** can be used to examine and maintain the Proone instance via this port.

### Binary Archive and Data Vault

Proone aims to be a decentralised botnet. To spread without binary distribution servers, Proone carries all the executables of the arch types it supports. For this, a special file structure is designed.

The **Data Vault**("**DVault**") is a binary block containing large and sensitive data necessary for the operation of Proone. DVault is a kempt version of the data table of Mirai.

DVault also helps reduce the size of Proone. Each executable contains the *.data* section. If there's a long string in the program, the value of the string will end up in the *.data* section of every executable. Compression alleviates this issue, but there's a limit because the size of data dictionary blocks can only get so big. Having a custom *.data* section for large data solves this issue at the cost of the size of the code for fetching and unmasking values from DVault. This implies that, in some cases, storing static values in the *.data* section of an ELF is more efficient[^2].

Another purpose of DVault is masking sensitive data like `PRNE_DATA_KEY_CNC_TXT_REC` and `PRNE_DATA_KEY_CRED_DICT` so that they're not revealed when the `strings` command is run on the executable or when the process is core dumped. DVault is loaded when Proone initialises. The loaded contents remain masked in memory and are unmasked only when needed.

The contents of DVault are XORed with a 256-byte array of random numbers generated on each compilation. This process makes it impossible to compress the DVault binary block because of its high entropy.
Therefore, it's not recommended to use DVault to store exceptionally large values. This issue may be solved by compressing the value separately at the cost of CPU time.

The **Binary Archive**("**BA**") is a binary block containing compressed executables and an index of the executables.

## Requirements

### Targeting a Wide Range of Devices and Kernel Configurations

A number of methods have been employed in an effort to target a wide range of Linux devices. The assumption is that there are still devices running old images of Linux, and targeting these devices means coding to the standard of old POSIX specs and testing under old versions of Linux(namely 2.6.x). The `_POSIX_C_SOURCE=200112L` macro is defined to meet this requirement. Note that using this macro does not give you an error when you accidentally use APIs not in the 200112L standard. The compiler will only give you a warning and your code will compile just fine. If you happen to use a function that the kernel of the host does not support, the syscall will fail with `ENOSYS`. If the feature requiring the new API can be silently switched off at runtime, removal of the macro is recommended.

The Linux kernel is highly configurable. Pseudo file systems and the device file system may not be present on a Linux host since they can be disabled. Disabling any of these file systems is unusual for PCs but practical on embedded devices. Proone does not assume that these file systems are available on the host and tries to run without them if they are not available.

### Running Lean

Proone is designed under the assumption that honouring other processes on the system will decrease the chance of getting caught by system administrators.

Proone is compartmentalised so that it's somewhat immune to syscall failures. This design is to counter `ENOMEM` as it runs lean on lean embedded systems. This implies that Proone can be initialised "half-complete". For example, it can be initialised with all the workers running except the Heartbeat worker.
In this case, Proone will be able to infect other devices on the network while unable to respond to CNC TXT REC. Another notable case would be an instance running without the Recon worker. It will respond to the CNC TXT REC and serve the local backdoor connections while unable to infect the other devices on the network.

Proone does not reattempt to start the workers it failed to run on start. The assumption is that the system is already running with its memory full to the brim and that it's futile to wait for a resource it failed to claim, as it's likely that the other services on the system will claim the resource at some point.

Proone does cooperative multitasking by using the **Pthsem** library. This is one of many efforts to "run lean" by restricting CPU usage to one logical thread. This may seem like a huge missed opportunity if Proone manages to infect a beefy multi-core system. Keep in mind that Proone is designed to run on resource-scarce embedded devices. Most poorly-designed vulnerable devices will be single core anyway. The strategy is to get as many low-powered devices infected as possible rather than having a few infected high-performance systems.

### Volatile Operation

TODO

## Dependencies

The dependencies for Proone have been kept to absolute necessities. **libssh2** is used for the SSH brute force vector. **Mbedtls**, used as libssh2's SSL backend, also provides TLS for connections to public name servers and for the Heartbeat protocol. **zlib** is used to implement the binary archive. All the libraries are compiled with default configurations. **Pthsem** is used for threading.

**libyaml** and **mariadb-connector-c-devel** are required for the **hostinfod** build. YAML has been chosen for the configuration file format and MariaDB for the DB backend.

[^1]: Maybe in the future when Toybox gains marketshare?
[^2]: i.e. representing values in code: `int value = 123;`