"fmsg" is the name given to the protocol and message definitions described in this document. The name is neither an abbreviation nor an acronym; it is read as "f-message". The "f" is inspired by popular programming languages such as C, whose printf uses "f" for "formatted", and "msg" is a common shortening of "message" that conveys the meaning while keeping the whole name succinct: "fmsg".
"DNS" stands for the Domain Name System.
"host" is an fmsg implementation which can send and receive fmsg messages to and from other hosts.
"message" refers to an entire message described in Message definition.
"message header" refers to the fields up to and including the attachment headers field in a message.
"thread" is a linked hierarchy of messages where messages relate to previous messages using the pid field.
"UTF-8" is the Unicode encoding standard: Unicode Transformation Format – 8-bit.
fmsg defines four message types: MESSAGE, CHALLENGE, CHALLENGE RESPONSE and "REJECT or ACCEPT RESPONSE". These structures are aggregates of Data Types and are described in the Definition section.
Throughout this document the following data types are used. All types are encoded little-endian.
| name | description |
|---|---|
| uint8 | 8 bit wide unsigned integer with a value in the set 0 to 255 |
| uint16 | 16 bit wide unsigned integer with a value in the set 0 to 65535 |
| uint32 | 32 bit wide unsigned integer with a value in the set 0 to 4294967295 |
| bit | single bit 0 or 1 within one of the uint types, the 0 based index of which is defined alongside in this document |
| float64 | 64 bit wide number in the set of all IEEE-754 64-bit floating-point numbers |
| byte | a uint8 |
| byte array | sequence of uint8 values the length of which is defined alongside in this document |
| bytes | a byte array |
| string | sequence of characters the length and encoding (e.g. ASCII, UTF-8...) of which is defined alongside in this document |
String lengths are always explicitly defined and null terminating characters are not used. This is a deliberate design decision because it prevents a class of buffer overrun bugs (search "Heartbleed bug"), simplifies message size calculation, and inherently limits the length of strings while adding no more data than a null terminating character would (since every string length here is defined by one uint8).
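
The length-prefix convention can be illustrated with a minimal sketch. The function names `encode_string` and `decode_string` are hypothetical helpers chosen for this example and are not defined by fmsg; the sketch only assumes the uint8 prefix described above.

```python
def encode_string(text: str, encoding: str = "utf-8") -> bytes:
    """Encode text as a uint8 length prefix followed by the encoded bytes."""
    raw = text.encode(encoding)
    if len(raw) > 255:
        raise ValueError("string exceeds 255 bytes and cannot be uint8-prefixed")
    return bytes([len(raw)]) + raw


def decode_string(buf: bytes, offset: int = 0, encoding: str = "utf-8") -> tuple[str, int]:
    """Decode a uint8-prefixed string; return the text and the offset just past it."""
    length = buf[offset]
    start = offset + 1
    end = start + length
    if end > len(buf):
        # The declared length is checked against the buffer before slicing.
        raise ValueError("buffer shorter than declared string length")
    return buf[start:end].decode(encoding), end
```

Because the length is read first and checked against the buffer, a decoder never scans for a terminator and never reads past the bytes it was given.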
In programmer friendly JSON a message could look like (once decoded from the binary format defined below):
{
"version": 1,
"flags": 0,
"pid": null,
"from": "@markmnl@fmsg.io",
"to": [
"@世界@example.com",
"@chris@fmsg.io"
],
"time": 1654503265.679954,
"topic": "Hello fmsg!",
"type": "text/plain;charset=UTF-8",
"size": 44,
"data": "The quick brown fox jumps over the lazy dog.",
"attachments": [
{
"size": 1024,
"filename": "doc.pdf"
}
]
}
On the wire messages are encoded thus:
| name | type | description |
|---|---|---|
| version | uint8 | A value less than 128 is the fmsg version number; otherwise this message is a CHALLENGE which is defined below. |
| flags | uint8 | See flags for each bit's meaning. |
| [pid] | byte array | SHA-256 hash of message this message is a reply to. Only present if flags has pid bit set. |
| from | fmsg address | See address definition. |
| to | uint8 + list of fmsg address | See address definition. Prefixed by uint8 count, addresses MUST be distinct (case-insensitive) of which there MUST be at least one. |
| time | float64 | POSIX epoch time message was received by host sending the message. |
| topic | uint8 + [UTF-8 string] | UTF-8 free text title of the message thread, prefixed by uint8 size which may be 0. |
| type | uint8 + [ASCII string] | Either a common type, see Common Media Types, or a US-ASCII encoded Media Type: RFC 6838. |
| size | uint32 | Size of data in bytes, 0 or greater |
| attachment headers | uint8 + [list of attachment headers] | See attachment header definition. Prefixed by uint8 count of attachments of which there may be 0. |
| data | byte array | The message body of type defined in type field and size in the size field |
| [attachments data] | byte array(s) | Sequence of octets, the boundaries of which are defined by the attachment header size(s), if any. |
- Square brackets "[ ]" indicate that a field, or part of it, may not exist on a message. Where the brackets surround the name, e.g. pid, the whole field may not be present (which in the case of pid is only valid if the message is not a reply). Where they surround part of the type, that part may not be present, e.g. the list of attachment headers will not be present if the uint8 prefix is 0.
- Topic is set only on the first message sent in a thread; thereafter topic size is always 0. This makes topic immutable because it cannot be changed by subsequent replies. (Presentations of message threads COULD use a local mutable field for display purposes).
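
As a rough illustration of the field order in the table above, the following sketch serialises the fields up to and including the attachment headers. `encode_header` and `_prefixed` are hypothetical names, not part of fmsg. The sketch assumes bit index 0 of flags is the least significant bit, and it only covers a Media Type spelt out in full (the common type encoding is deliberately rejected).

```python
import struct


def _prefixed(raw: bytes) -> bytes:
    """Prefix raw bytes with their length as a single uint8."""
    if len(raw) > 255:
        raise ValueError("field longer than 255 bytes")
    return bytes([len(raw)]) + raw


def encode_header(version, flags, sender, recipients, time, topic,
                  media_type, size, attachments=(), pid=None) -> bytes:
    """Serialise the message fields that precede the data, little-endian."""
    if bool(flags & 0x01) != (pid is not None):  # assumption: bit 0 is the LSB
        raise ValueError("has pid flag and pid field must agree")
    if flags & 0x02:
        raise ValueError("common type encoding is not covered by this sketch")
    if not 1 <= len(recipients) <= 255:
        raise ValueError("between 1 and 255 recipients are required")
    if len({r.casefold() for r in recipients}) != len(recipients):
        raise ValueError("recipient addresses must be distinct")
    if len(attachments) > 255:
        raise ValueError("at most 255 attachments fit a uint8 count")

    out = bytearray([version, flags])
    if pid is not None:
        if len(pid) != 32:
            raise ValueError("pid must be a 32 byte SHA-256 hash")
        out += pid
    out += _prefixed(sender.encode("utf-8"))
    out.append(len(recipients))
    for address in recipients:
        out += _prefixed(address.encode("utf-8"))
    out += struct.pack("<d", time)           # float64 POSIX epoch time
    out += _prefixed(topic.encode("utf-8"))
    out += _prefixed(media_type.encode("ascii"))
    out += struct.pack("<I", size)           # uint32 size of data
    out.append(len(attachments))
    for filename, attachment_size in attachments:
        out += _prefixed(filename.encode("utf-8"))
        out += struct.pack("<I", attachment_size)
    return bytes(out)
```

The message data and any attachment data would follow these bytes on the wire.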
Only one time field is present on a message and this time is stamped by the sending host when it acquired the message. (Implementations COULD associate additional timestamps with messages, such as the time message was delivered).
fmsg includes some time checking and controls: messages too far in the future or past compared to the current time of the receiver are rejected, and replies cannot claim to be sent before their parent (see Reject or Accept Response). This all relies on the accuracy of the clocks being used, so some leniency is granted, determined by the receiving host. Bearing in mind a host may not be reachable for some time, greater leniency SHOULD be given to messages from the past. Since the time field is stamped by the sending host, a host need only ensure its own clock is accurate.
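
The time checks might be sketched as below. The function name `check_time` and the tolerance parameters `max_past` and `max_future` are assumptions for illustration; fmsg leaves the actual leniency to the receiving host. The returned numbers are the "past time", "future time" and "time travel" codes from the Reject or Accept Response table.

```python
from typing import Optional

PAST_TIME = 7      # timestamp too far in the past
FUTURE_TIME = 8    # timestamp too far in the future
TIME_TRAVEL = 9    # timestamp is before parent timestamp


def check_time(msg_time: float, now: float, max_past: float, max_future: float,
               parent_time: Optional[float] = None) -> Optional[int]:
    """Return a reject code if the timestamp is unacceptable, else None."""
    if msg_time > now + max_future:
        return FUTURE_TIME
    if msg_time < now - max_past:
        return PAST_TIME
    if parent_time is not None and msg_time < parent_time:
        return TIME_TRAVEL
    return None
```

A host following the SHOULD above would typically choose `max_past` considerably larger than `max_future`.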
| bit index | name | description |
|---|---|---|
| 0 | has pid | Set if this message is in reply to another and pid field is present. |
| 1 | common type | Indicates the type field is just a uint8 value and Media Type can be looked up per Common Media Types |
| 2 | important | Sender indicates this message is IMPORTANT! |
| 3 | no reply | Sender indicates any reply will be discarded. |
| 4 | no challenge | Sender asks that the challenge be skipped. Hosts accepting unsolicited messages SHOULD be cautious about honouring this, especially on the wild Internet. |
| 5 | deflate | Message data is compressed using the zlib structure (defined in RFC 1950), with the deflate compression algorithm (defined in RFC 1951). |
| 6 | | Unused, reserved for future use |
| 7 | under duress | Sender indicates this message was written under duress. |
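
One way to work with the flags byte is a bit-flag enumeration, sketched here. The `Flags` class and `body_bytes` function are hypothetical names, and the sketch assumes bit index 0 is the least significant bit of the uint8, which this document does not state explicitly.

```python
import zlib
from enum import IntFlag


class Flags(IntFlag):
    HAS_PID = 1 << 0
    COMMON_TYPE = 1 << 1
    IMPORTANT = 1 << 2
    NO_REPLY = 1 << 3
    NO_CHALLENGE = 1 << 4
    DEFLATE = 1 << 5
    # bit 6 is unused and reserved for future use
    UNDER_DURESS = 1 << 7


def body_bytes(flags: Flags, data: bytes) -> bytes:
    """Return the message body, inflating it when the deflate flag is set."""
    if Flags.DEFLATE in flags:
        # zlib.decompress expects the zlib structure of RFC 1950 by default.
        return zlib.decompress(data)
    return data
```

For example, `Flags(0x21)` has both `HAS_PID` and `DEFLATE` set.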
If the common type flag bit is set in the flags field, then the type field consists of one uint8 value which maps to a Media Type, including parameters, in the table below. A value not in the table is invalid and the entire message SHOULD be rejected with an "invalid" REJECT response. If the common type bit is not set, the first uint8 is the length in bytes of the US-ASCII encoded Media Type that follows, per RFC 6838. Note that even if the common type flag bit is not set (i.e. the Media Type is spelt out in full), the Media Type may be one of these "common" types.
For reference the current IANA list of Media Types is located here.
Numerical identifier to common Media Types mapping.
| number | Media Type |
|---|---|
| 1 | application/epub+zip |
| 2 | application/json |
| 3 | application/msword |
| 4 | application/octet-stream |
| 5 | application/pdf |
| 6 | application/rtf |
| 7 | application/vnd.amazon.ebook |
| 8 | application/vnd.ms-excel |
| 9 | application/vnd.ms-fontobject |
| 10 | application/vnd.ms-powerpoint |
| 11 | application/vnd.oasis.opendocument.presentation |
| 12 | application/vnd.oasis.opendocument.spreadsheet |
| 13 | application/vnd.oasis.opendocument.text |
| 14 | application/vnd.oasis.opendocument.text-web |
| 15 | application/vnd.openxmlformats-officedocument.presentationml.presentation |
| 16 | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet |
| 17 | application/vnd.openxmlformats-officedocument.wordprocessingml.document |
| 18 | application/xhtml+xml |
| 19 | application/xml |
| 20 | application/zip |
| 21 | audio/aac |
| 22 | audio/midi |
| 23 | audio/ogg |
| 24 | audio/opus |
| 25 | audio/wav |
| 26 | audio/webm |
| 27 | font/otf |
| 28 | font/ttf |
| 29 | font/woff |
| 30 | font/woff2 |
| 31 | image/apng |
| 32 | image/avif |
| 33 | image/bmp |
| 34 | image/gif |
| 35 | image/jpeg |
| 36 | image/png |
| 37 | image/svg+xml |
| 38 | image/tiff |
| 39 | image/webp |
| 40 | text/calendar |
| 41 | text/css |
| 42 | text/csv |
| 42 | text/markdown |
| 43 | text/html |
| 44 | text/javascript |
| 45 | text/plain;charset=ASCII |
| 46 | text/plain;charset=UTF-16 |
| 47 | text/plain;charset=UTF-8 |
| 48 | text/vcard |
| 48 | video/H264 |
| 49 | video/H264-RCDO |
| 50 | video/H264-SVC |
| 51 | video/H265 |
| 52 | video/H266 |
| 53 | video/ogg |
| 54 | video/VP8 |
| 55 | video/VP9 |
| 56 | video/webm |
| 57 | model/3mf |
| 59 | model/gltf-binary |
| 60 | model/obj |
| 61 | model/stl |
| 62 | model/step |
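
Decoding the type field under both encodings could look like the following sketch. `COMMON_TYPES` holds only a three-row excerpt of the table above for brevity, and `decode_type` is a hypothetical name; a real implementation would carry the full table.

```python
# Excerpt of the common Media Types table; not the complete mapping.
COMMON_TYPES = {
    2: "application/json",
    5: "application/pdf",
    36: "image/png",
}


def decode_type(buf: bytes, offset: int, common_type_flag: bool) -> tuple[str, int]:
    """Return the Media Type and the offset just past the type field."""
    if common_type_flag:
        number = buf[offset]
        if number not in COMMON_TYPES:
            # Per the text above, the whole message SHOULD be rejected as invalid.
            raise ValueError(f"unknown common type {number}")
        return COMMON_TYPES[number], offset + 1
    length = buf[offset]
    start = offset + 1
    end = start + length
    if end > len(buf):
        raise ValueError("buffer shorter than declared type length")
    return buf[start:end].decode("ascii"), end
```

With the flag set the field occupies exactly one byte, which is where the space saving comes from.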
Attachment headers consist of the two fields, filename and size:
| name | type | comment |
|---|---|---|
| filename | string | UTF-8 prefixed by uint8 size. |
| size | uint32 | Size of attachment data. uint32 is the maximum theoretical size, but hosts can and should accept less. |
filename MUST be:
- UTF-8
- any letter in any language, or any numeric characters (`\p{L}` and `\p{N}`, Unicode Standard Annex #44 and #18)
- the hyphen "-" or underscore "_" characters non-consecutively and not at beginning or end
- unique amongst attachments, case-sensitive
- less than 256 bytes length
Attachment data
| name | type | comment |
|---|---|---|
| data | byte array | Sequence of octets located after all attachment headers, boundaries of each attachment are defined by corresponding size in attachment header(s) |
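
Since attachment data carries no delimiters, the boundaries come entirely from the header sizes. A minimal sketch, with `split_attachments` as a hypothetical helper name:

```python
def split_attachments(blob: bytes, sizes: list[int]) -> list[bytes]:
    """Slice the attachment data into one byte string per attachment header."""
    if sum(sizes) != len(blob):
        raise ValueError("attachment data length does not match header sizes")
    parts = []
    offset = 0
    for size in sizes:
        parts.append(blob[offset:offset + size])
        offset += size
    return parts
```

The sizes are taken in the same order as the attachment headers appear in the message.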
Domain part is the domain name (RFC 1035) owning the address. Recipient part identifies the recipient known to hosts for the domain. A leading "@" character is prepended to distinguish fmsg addresses from email addresses. The second "@" separates recipient and domain name as per the norm.
Recipient part is a string of characters which MUST be:
- UTF-8
- any letter in any language, or any numeric characters (`\p{L}` and `\p{N}`, Unicode Standard Annex #44 and #18)
- the hyphen "-" or underscore "_" characters non-consecutively and not at beginning or end
- unique on host using case-insensitive comparison
- less than 256 bytes length when combined with domain name and @ characters
A whole address is encoded UTF-8 prepended with size:
| name | type | comment |
|---|---|---|
| address | uint8 + string | UTF-8 encoded string prefixed with uint8 size |
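
The recipient rules can be checked without a regular expression by using Unicode general categories. This is a sketch only: `is_valid_recipient` and `parse_address` are hypothetical names, domain name validation is left out, and "non-consecutively" is interpreted here as "no two of the characters '-' and '_' adjacent", which is an assumption.

```python
import unicodedata


def is_valid_recipient(part: str) -> bool:
    """Check the character rules for the recipient part of an address."""
    if not part or part[0] in "-_" or part[-1] in "-_":
        return False
    previous_was_special = False
    for ch in part:
        if ch in "-_":
            if previous_was_special:
                return False
            previous_was_special = True
        elif unicodedata.category(ch)[0] in ("L", "N"):  # \p{L} or \p{N}
            previous_was_special = False
        else:
            return False
    return True


def parse_address(address: str) -> tuple[str, str]:
    """Split '@recipient@domain' into (recipient, domain)."""
    if len(address.encode("utf-8")) > 255:
        raise ValueError("address must be less than 256 bytes")
    if not address.startswith("@"):
        raise ValueError("address must start with '@'")
    recipient, separator, domain = address[1:].rpartition("@")
    if not separator or not domain or not is_valid_recipient(recipient):
        raise ValueError("malformed address")
    return recipient, domain
```

Uniqueness on a host is a property of the host's user store and so is not something a parser alone can verify; a host would compare recipients case-insensitively, for instance with `str.casefold()`.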
| name | type | comment |
|---|---|---|
| version | uint8 | Challenge version, decrementing from 255 and corresponding to the fmsg protocol version: 255 is CHALLENGE for fmsg protocol version 1, 254 would be CHALLENGE for fmsg protocol version 2, etc. |
| header hash | 32 bytes | SHA-256 hash of message header being sent/received up to and including type field. |
A challenge response is the next 32 bytes received in reply to a challenge request; its existence indicates the sender accepted the challenge. This SHA-256 hash MUST be kept to ensure the complete message (including attachments), once downloaded, matches.
| name | type | comment |
|---|---|---|
| msg hash | 32 byte array | SHA-256 hash of entire message. |
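
The two challenge structures are small enough to sketch directly. `build_challenge` and `parse_challenge` are hypothetical names. The caller is assumed to pass exactly the header bytes covered by the header hash, and the version arithmetic (256 minus the protocol version) is inferred from the 255 and 254 examples in the table above.

```python
import hashlib


def build_challenge(protocol_version: int, header: bytes) -> bytes:
    """Build a CHALLENGE: one version byte followed by the SHA-256 header hash."""
    if not 1 <= protocol_version <= 127:
        raise ValueError("protocol version must be between 1 and 127")
    return bytes([256 - protocol_version]) + hashlib.sha256(header).digest()


def parse_challenge(buf: bytes) -> tuple[int, bytes]:
    """Return (protocol version, header hash) from a 33 byte CHALLENGE."""
    if len(buf) != 33:
        raise ValueError("a CHALLENGE is exactly 33 bytes")
    if buf[0] < 128:
        raise ValueError("first byte below 128 is a message version, not a CHALLENGE")
    return 256 - buf[0], buf[1:]
```

The CHALLENGE RESPONSE needs no framing of its own: it is simply the 32 byte SHA-256 digest of the entire message.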
A code less than 100 indicates rejection for all recipients and will be the only value. Other codes are per recipient, in the same order as in the to field of the message, excluding recipients for other domains.
| name | type | comment |
|---|---|---|
| codes | byte array | a single uint8 code or a sequence of uint8 codes |
| code | name | description |
|---|---|---|
| 1 | invalid | the message is malformed, i.e. not in spec, and cannot be decoded |
| 2 | unsupported version | the version is not supported by the receiving host |
| 3 | undisclosed | no reason is given |
| 4 | too big | total size exceeds host's maximum permitted size of messages |
| 5 | insufficient resources | such as disk space to store the message |
| 6 | parent not found | parent referenced by pid not found |
| 7 | past time | timestamp is too far in the past for this host to accept |
| 8 | future time | timestamp is too far in the future for this host to accept |
| 9 | time travel | timestamp is before parent timestamp |
| 10 | duplicate | message has already been received |
| 11 | must challenge | no challenge was requested but is required |
| 12 | cannot challenge | challenge was requested by sender but receiver is configured not to |
| 100 | user unknown | the recipient message is addressed to is unknown by this host |
| 101 | user full | insufficient resources for specific recipient |
| 200 | accept | message received, no more data |
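
A sending host might map the received codes back to recipients as in this sketch. `interpret_response` is a hypothetical name, and `recipients` is assumed to already be filtered to the receiving host's domain and kept in the order of the to field.

```python
def interpret_response(codes: bytes, recipients: list[str]) -> dict[str, int]:
    """Map each recipient on the receiving host to its response code."""
    if not codes:
        raise ValueError("empty response")
    if codes[0] < 100:
        # A code below 100 rejects the message for all recipients and stands alone.
        if len(codes) != 1:
            raise ValueError("a code below 100 must be the only value")
        return {recipient: codes[0] for recipient in recipients}
    if len(codes) != len(recipients) or any(code < 100 for code in codes):
        raise ValueError("expected one per-recipient code for each recipient")
    return dict(zip(recipients, codes))
```

For instance, a response of `bytes([200, 100])` for two recipients means the first accepted the message and the second is unknown to the host.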
A message is sent from the sender's host to each unique recipient host (i.e. each domain only once even if multiple recipients with the same domain). Sending a message either wholly succeeds or fails per recipient. During the sending from one host to another several steps are performed depicted in the below diagram. Two connection-orientated, reliable, in-order and duplex transports are required to perform the full flow. Transmission Control Protocol (TCP) is an obvious choice, on top of which Transport Layer Security (TLS) may meet your encryption needs.
Protocol flow diagram
The following example shows @A@example.com sending a message to @B@example.edu:
- Connection Initiation and Header Exchange
  - The Sending Host (Host A) initiates a connection (Connection 1) to the Receiving Host (Host B).
  - Host A begins sending the message to Host B.
  - Host B downloads the message header, parses it, then MUST perform a DNS lookup on the _fmsg subdomain of the from address in the message header (_fmsg.example.com) to verify that the IP address of the incoming connection is among those authorised by the sending domain. If the incoming IP address is not in the authorised set, Host B MUST terminate the connection. See Domain Resolution.
- The Automatic Challenge
  - Host B MUST initiate a separate connection (Connection 2) back to Host A using the same incoming IP address.
  - Host B sends a CHALLENGE to Host A, supplying the hash of the message header received in Connection 1.
  - Host A MUST verify the authenticity of the challenge by checking the header hash matches a message currently being sent to Host B.
    - If not matched then the connection MUST be terminated.
    - If matched, Host A transmits a CHALLENGE RESP on Connection 2 consisting of the SHA-256 hash of the message.
- Reject or Message Download Continuation
  - Host B performs final checks on the CHALLENGE RESP then either rejects the entire message outright or continues to download the message on Connection 1. A REJECT response at this stage gives the receiving host a chance to reject the message before continuing the download, for any reason, e.g. the message is too big.
    - REJECT MUST apply to all recipients belonging to Host B, i.e. the "REJECT or ACCEPT RESPONSE" code must be less than 100, see: Reject or Accept Response.
    - REJECT MUST be sent on Connection 1.
    - REJECT, if sent, MUST immediately be followed by closure of Connection 1, upon which the message exchange is completed.
  - Connection 2 MUST be closed, initiated by Host B.
  - If not rejected, the message transmission continues on Connection 1. Host B completes the download of the full remaining message, i.e. remaining bytes totalling message size + the sum of any attachment sizes.
- Integrity Verification and Disposition
  - Host B MUST perform a message integrity check by calculating the SHA-256 hash of the fully downloaded message including header, data and any attachments, then comparing this calculated hash against the hash provided in the CHALLENGE RESP earlier.
    - If the hashes do not match, Host B MUST terminate the connection.
    - If the hashes match, Host B transmits a "REJECT or ACCEPT RESPONSE" code to Host A for each individual recipient belonging to Host B.
  - Host A MUST record the "REJECT or ACCEPT RESPONSE" per recipient.
  - Host A and Host B gracefully close Connection 1, completing the message exchange.
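
The download continuation and integrity check on Host B reduce to two small computations, sketched here. `remaining_bytes` and `verify_message` are hypothetical names, and network handling is deliberately left out. `header` is assumed to hold every byte already received before the data, and `rest` the data plus any attachment data.

```python
import hashlib
import hmac


def remaining_bytes(size: int, attachment_sizes: list[int]) -> int:
    """Bytes still to download after the header: data plus all attachments."""
    return size + sum(attachment_sizes)


def verify_message(header: bytes, rest: bytes, challenge_resp: bytes) -> bool:
    """Compare the SHA-256 of the whole message with the CHALLENGE RESP hash."""
    digest = hashlib.sha256(header + rest).digest()
    # compare_digest is a constant-time comparison; a plain == would also work.
    return hmac.compare_digest(digest, challenge_resp)
```

If `verify_message` returns False the connection is terminated; otherwise the per-recipient response codes are sent.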
Hosts MUST obtain and verify authorised IP addresses by resolving the subdomain _fmsg of the domain name in an fmsg address and evaluating the resulting A and AAAA records (including those obtained via CNAME aliasing). For example if @alice@example.com is sending a message to @bob@example.edu, Alice's authorised fmsg host IP addresses are obtained by resolving _fmsg.example.com, and Bob's from _fmsg.example.edu.
Sending and receiving hosts SHOULD perform DNSSEC validation for _fmsg lookups when supported. If DNSSEC validation fails, the connection MUST be terminated.
Before opening the second connection to send CHALLENGE, the receiving host MUST independently resolve the sender's authorised IP set from the _fmsg subdomain and verify the originating IP address of the incoming connection is in that set. If verification fails the connection MUST be terminated without challenging. This ensures the fmsg host sending a message is listed by the sender's domain and prevents orchestrating a denial-of-service style attack by falsifying an address to trigger many fmsg hosts challenging an unsuspecting host.
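
The sender verification might be sketched with the standard library resolver as follows. `resolve_fmsg_ips` and `is_authorised_sender` are hypothetical names. The sketch relies on the system resolver to follow CNAME aliases and performs no DNSSEC validation, which a production host would need to handle separately.

```python
import ipaddress
import socket


def resolve_fmsg_ips(domain: str) -> set:
    """Resolve A and AAAA records of the _fmsg subdomain into a set of IPs."""
    infos = socket.getaddrinfo(f"_fmsg.{domain}", None, proto=socket.IPPROTO_TCP)
    return {ipaddress.ip_address(info[4][0]) for info in infos}


def is_authorised_sender(from_address: str, peer_ip: str,
                         resolve=resolve_fmsg_ips) -> bool:
    """Check the connection's originating IP against the sender domain's set."""
    domain = from_address.rpartition("@")[2]
    return ipaddress.ip_address(peer_ip) in resolve(domain)
```

The `resolve` parameter exists so the lookup can be swapped out, for example with a DNSSEC-validating resolver or a stub in tests. When the check returns False the connection is terminated without challenging.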
Various alternatives were considered before arriving at the _fmsg subdomain method. For instance, an MX record combined with a WKS record on the domain would align with the original intent of RFC 974, allowing message exchange services to be located for a domain, with WKS specifying the protocol. However, the intent of MX records has been superseded by RFC 1123 and is now assumed to be SMTP, and WKS is obsolete. Using a TXT record as SPF does was considered too, but that adds to the growing proliferation of TXT records. So the _fmsg subdomain method was chosen as it allows the receiver to verify that the originating host of a message is explicitly authorised by the owning domain. Also, because the incoming IP address and sender's domain will be known to the receiving host, only one domain lookup is needed.
Verifying the sender's IP address requires the receiving host to observe the true originating IP address of the connection. This implies that fmsg hosts must be directly routable, or that any intervening infrastructure preserves and conveys the originating IP address. Care must therefore be taken when fmsg hosts operate behind network address translators (NAT), layer-4 load balancers, or proxying infrastructure.
