A deeper look at the tech behind Textile’s Threads

How Threads are woven into the fabric of Textile Photos

Written by Carson Farmer & Sander Pick

Download the app, take a picture, share!

Recently, we’ve started writing more about the technologies underlying Textile Photos that help keep your photos (and likes and comments, etc) safe and secure on the decentralized web. In our previous post, we talked about the encryption process in Textile Photos, with a focus on how Textile delivers end-to-end encrypted photo sharing. Today’s post is a follow-up (though it should also be sufficiently detailed to stand on its own). This time, we highlight how Textile coordinates private photo sharing among groups of users, a feature we call Threads.

Why we built it
We designed Threads to allow groups of users to share photos securely and privately, without any centralized, authoritative database. We also made sure it all works well offline, that it’s possible to recover lost data, and that it’s easy to add new members.

What makes Threads different
Threads allow private groups to post photos and interact over a decentralized network, maintaining complete control over their own content. Textile operates in a completely zero-knowledge framework. Private by design.

Why Threads are exciting
Because photos are just the first step. Today, Threads allow users to share a photo with other Thread members in a secure, decentralized way. More generally, Threads can facilitate secure sharing, coordination, and storage of many types of data over a decentralized network. Upgradable by design.

On the surface, you can think of each Thread like a decentralized database, shared between specific participants. We built Threads into the fabric of Textile (see what we did there 😉) because group members need a record of who shared what photo, and when. But, once we created Threads, we realized just how powerful a concept this was — for those familiar with mobile app development, think Realm or Firebase but without the centralized server.

To understand what Threads bring to the table, you really need to understand Threads themselves. So let’s dig a bit deeper into how Textile conceptualizes and implements Threads, and how that helps keep your photos (and likes and comments, etc) safe and secure on the decentralized web. We’ll start by highlighting the specific requirements we had when developing Threads, and then break down each of these requirements into the specific solutions that we came up with. Along the way, our CTO Sander Pick will highlight how those various solutions came about, and why we think our approach is in the best interest of our users.

The experience

Textile Photos allows small, decentralized, private groups to share photos, send messages, and engage with each other. That’s the experience, so it has to ‘just work’.

Requirements

To drive the Textile user experience, we identified five key features needed in the sharing protocols:

A mechanism to share and receive state updates within a group of n users
To enable photo sharing (and other common interactions such as likes and comments) among a group of friends and/or colleagues, some concept of a shared state is required.
A way to ensure the shared state stays resilient to peers dropping out or latency issues
Since we’re operating in a mobile environment, we have to expect peers to continually drop ‘offline’ due to coverage issues, app back-grounding, battery optimizations, and a whole slew of other reasons for a mobile device to be cut off from a network.
A way to avoid state conflicts with other members of the group
On top of the requirements above, when peers do come back online, we don’t want any state changes that were made by other members of the group while they were disconnected from the network to conflict with their own local changes.
A mechanism to recover the full state from the network as a whole
Another important consideration in the mobile world is that the number of users (out of n) that are online at any given time is generally unknown, and quite possibly zero. To reiterate, we want a decentralized shared state, but it has to work even when you are the only member online. This means we have to assume the full group state may not ever be directly accessible (i.e., downloadable) from a single group member. This is in contrast to something like Bitcoin, where new nodes are able to download the full blockchain from any connected peer.
A way to link updates via their content, rather than where they are stored
Since we are building on top of the IPFS network, and would like to eventually support a Filecoin-based future in which users can select from a multitude of decentralized storage providers, Threads need to embrace content addressing, rather than location addressing. This makes it easy to grow and change the underlying network, without affecting data access and sharing.

With these requirements in mind, let’s break down our solutions into their individual components…

Solutions

1. Handling Updates — use a peer-to-peer network with structured updates

First things first: how do we handle state updates between a set of distributed peers? This is mostly about peer-to-peer (p2p) networking. And when it comes to communicating between heterogeneous network devices (computers, phones, IoT devices, etc), we actually need many different types of network protocols. That way, no matter what type of device we are talking about — be it a phone, desktop computer, browser, or Internet-enabled fridge — it is able to communicate with other devices located in the same room, or on the other side of the planet.

Decentralized, peer-to-peer networks are radically different types of communication networks.

At Textile, we use the super amazing libp2p library for our networking needs. Libp2p is a networking stack and library (you might have heard it called a protocol suite) modularized out of the IPFS project, and bundled separately for other tools to use. Essentially, libp2p does all the heavy network lifting so that we can focus on our core task: exchanging updates between communicating peers.

Like many IPFS-based projects, Textile uses Protocol Buffers for over-the-wire communication, and advanced cryptographic algorithms to secure those messages. Essentially, each update to the shared group state is just an encrypted Protobuf message with two parts: a header with author and date info, and a body with the type-specific data. These pieces are sent in their own inner ‘envelope’, which contains a link to the encrypted message and the Thread ID. This inner envelope is then signed by the sender and placed into the wire ‘envelope’ along with its signature. You can read more about some of the cryptographic tools Textile uses in this previous article. You can also check out how we structure our Protobuf messages, learn a bit more about how secio works, plus check out some recent updates to message encryption while you’re at it.
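To make the layering a bit more concrete, here is a minimal Python sketch of the envelope idea. It is not Textile’s actual code: JSON stands in for Protobuf, an HMAC over a shared key stands in for the sender’s signature, encryption is left out entirely, and the names make_update, wrap, and verify are hypothetical.

```python
import hashlib
import hmac
import json
import time


def make_update(author: str, update_type: str, payload: dict) -> bytes:
    """Serialize an update: a header (author, date) plus a type-specific body."""
    message = {
        "header": {"author": author, "date": int(time.time())},
        "body": {"type": update_type, "data": payload},
    }
    # Assumption: JSON replaces the encrypted Protobuf used in practice.
    return json.dumps(message, sort_keys=True).encode()


def wrap(update: bytes, thread_id: str, signing_key: bytes) -> dict:
    """Build the inner envelope, sign it, and place both in the wire envelope."""
    inner = {
        "thread": thread_id,
        "link": hashlib.sha256(update).hexdigest(),  # link to the message
    }
    inner_bytes = json.dumps(inner, sort_keys=True).encode()
    # Assumption: an HMAC stands in for the sender's real signature scheme.
    signature = hmac.new(signing_key, inner_bytes, hashlib.sha256).hexdigest()
    return {"inner": inner, "sig": signature}


def verify(envelope: dict, signing_key: bytes) -> bool:
    """Recompute the signature over the inner envelope and compare."""
    inner_bytes = json.dumps(envelope["inner"], sort_keys=True).encode()
    expected = hmac.new(signing_key, inner_bytes, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["sig"])
```

A receiver would call verify on the wire envelope first, and only then follow the link to fetch and decrypt the message itself.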

2. Network Resilience — support offline messaging so peers can come and go

If you are at all familiar with libp2p, then you might be thinking “ah libp2p has a pubsub layer that would be perfect for exchanging updates to a group of connecting peers”. And while you’d certainly be right, there are a few key limitations that make using pubsub for something like Textile Photos pretty cumbersome. On top of this, while pubsub is super nice for things like chat rooms or distributed services, it is a ‘fire-and-forget’ messaging protocol, meaning that once a peer publishes a message, it is up to its peers to ensure they are listening for the right message at the right time. To circumvent this, some pubsub systems introduce message echoing, to ensure a message stays in the system long enough to be picked up by the peers who might need it. However, this can lead to really noisy network traffic, and is really just a band-aid over a larger issue.

So this starts to get at our second requirement, that the shared state stays resilient to peers dropping out. We need to assume peers might not be around to receive important messages in ‘real-time’, which is a common problem with p2p systems. Right now, Textile addresses this problem by enabling what you might call offline messaging. Since we’re already using IPFS for data storage and communication, we wanted to take advantage of some of the core technologies driving IPFS. In particular, we (currently) use a special fork of the Kademlia-based distributed hash table (DHT) used by IPFS, that allows us to post messages for a peer directly in the DHT. For those unfamiliar with DHTs, they are a hash table where the data is spread across a network of nodes or peers. And these peers are all coordinated to enable efficient access and lookup between nodes in a decentralized way. You can read more about this kind of stuff in our previous article about how IPFS peers find, request, and retrieve content (and each other) on the decentralized web. So, when a peer we want to communicate with is offline, rather than blindly sending them a message that will never be received, we post a message to Textile’s DHT, and they can then retrieve that message the next time they come online again. Conceptually simple, and works pretty well in practice.
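The store-and-retrieve idea can be sketched in a few lines of Python. This is a toy model under loud assumptions: an in-memory dictionary plays the role of the DHT, the class and method names are hypothetical, and pending messages are removed once delivered, which a real DHT does not make easy.

```python
from collections import defaultdict


class OfflineInbox:
    """Toy stand-in for a DHT that holds messages for offline peers."""

    def __init__(self):
        self._online = set()
        self._pending = defaultdict(list)   # peer ID -> messages waiting
        self.delivered = defaultdict(list)  # peer ID -> messages received

    def send(self, peer_id: str, message: str) -> str:
        """Deliver directly if the peer is online, otherwise store the message."""
        if peer_id in self._online:
            self.delivered[peer_id].append(message)
            return "direct"
        self._pending[peer_id].append(message)
        return "stored"

    def set_offline(self, peer_id: str) -> None:
        self._online.discard(peer_id)

    def set_online(self, peer_id: str) -> list:
        """Mark the peer online and hand over anything posted while it was away."""
        self._online.add(peer_id)
        queued = self._pending.pop(peer_id, [])
        self.delivered[peer_id].extend(queued)
        return queued
```

The key design point is that the sender never has to know when the recipient will be back: it posts once, and the recipient pulls on reconnect.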

There are still some issues with our current approach, including that it is difficult, if not impossible, to remove messages from the DHT manually. Indeed, it can start to get a bit messy when left-over offline messages have to be retrieved each time a peer comes back online. For a peer that goes in and out of service frequently, this could lead to a lot of network traffic and wasted CPU cycles. So, we’ve implemented an alternative to this DHT-based offline messaging system that does not suffer from these limitations (and also allows us to participate in the public IPFS network), while still remaining decentralized and scalable in the long-term. This new approach should be released soon, after more testing and evaluation. You can follow along with this progress as part of the move towards a Cafe-based setup (see also What’s Next).

3 & 4. Avoiding Conflicts & State Recovery — use a CRDT to keep an immutable history across peers

Ok, so our next requirement and its associated solution have received a great deal of research and development attention over the years. The question of “how to avoid state conflicts with other members of a group?” comes up when working collaboratively on documents, updating shared databases, etc. For the purposes of updating a shared Thread of photos, it turns out that an operation-based CRDT that supports append-only operations is pretty much all you need to get going. You can think of Textile’s CRDT setup (which shares some ideas with ipfs-log) as an immutable, append-only tree that can be used to model a mutable, shared state between peers. Every entry in the tree is saved on IPFS, and each points to the hash of one or more previous entries, forming a graph. These trees can be combined via fast-forward and three-way merges.

Speaking of merges, for those familiar with git and other similar systems, you might be thinking this sounds a lot like a git hash tree, Merkle DAG, or even a blockchain. And you’d be right! The concepts are very similar, and this buys us some really nice properties for building and maintaining a shared state. By modeling our shared Thread state in this way, we benefit from tried and tested methods for allowing a peer to incorporate other peers’ updates into their state while maintaining history (via fast-forwards and three-way merging for example).

So what does this look like in practice? Currently — because things might change as we make improvements to the underlying implementation — each Thread in Textile Photos is essentially a chain of updates, where each update represents some specific action or event. For instance, when you create a new Thread, under-the-hood you are actually creating a JOIN update on a new Thread chain. Similarly, when you update the Thread via a new photo (DATA update), comment, or like (ANNOTATION update), you’re actually updating that Thread chain. After each modification, the HEAD of the Thread will point to the latest update.
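As a rough illustration of such a chain, the Python sketch below hashes each update together with the hash of its parent and moves HEAD forward on every append. The Thread class, its field names, and the use of SHA-256 over JSON are assumptions made for this sketch, not Textile’s actual block format.

```python
import hashlib
import json


class Thread:
    """A chain of updates in which each update links to its parent by hash."""

    def __init__(self):
        self.blocks = {}   # hash -> update
        self.head = None   # hash of the latest update

    def add(self, update_type: str, author: str, payload=None) -> str:
        update = {
            "type": update_type,           # e.g. JOIN, DATA, ANNOTATION
            "author": author,
            "payload": payload,
            "parents": [self.head] if self.head else [],
        }
        # The update's identity is the hash of its content, parent link included.
        digest = hashlib.sha256(
            json.dumps(update, sort_keys=True).encode()
        ).hexdigest()
        self.blocks[digest] = update
        self.head = digest                 # HEAD always points to the latest update
        return digest


thread = Thread()
join = thread.add("JOIN", "user-a")                 # creating a Thread starts with a JOIN
photo = thread.add("DATA", "user-a", "photo-1")     # adding a photo
like = thread.add("ANNOTATION", "user-a", "like")   # a like or comment
```

Because the parent hash is part of what gets hashed, changing any earlier update would change every hash after it, which is what makes the history effectively immutable.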

To give you a better idea of what exactly we’re talking about, consider the following set of operations: User A creates a new Thread, and adds a Photo. They then externally invite User B (sent via some other secure communication channel), who eventually joins the Thread. But before User B is able to join the Thread, User A adds another Photo, moving the Thread’s HEAD forward. By the time User B joins the Thread, they’d end up with a Thread sequence that looks something like this:

Thread join example. Solid arrows point towards the ‘parent’ of a given update, over-the-wire communications are indicated with a 📶-style arrow, and messages that are rebroadcast (e.g., via the welcome message) are indicated with a dashed arrow. Similarly, merges point to both their parent updates.

Here, we see the merge happening at the end of the sequence because the bottom peer is joining via an external invite that is no longer HEAD, forcing them to merge the most recent DATA update with their own JOIN update. But since merge results are deterministic (given the same parents), both peers create the MERGE update locally, and do not broadcast them to avoid trading merges back and forth.

A more complete sequence is given in the following figure. Suppose User A goes ‘offline’ (e.g., their phone goes to sleep, they shut down the app, they lose their data connection, etc), and in the meantime, both Users A and B update the Thread, with User A adding an ANNOTATION update, and User B adding a new Photo (DATA update). Now, when User A comes back online, there is a conflict, and both Users create a MERGE update to remedy this. A MERGE update has two parents, in this case, the DATA and ANNOTATION update from the different users. As always, the HEAD continues to point to the latest update (which in the example below eventually becomes an ANNOTATION from User B). Once both peers are online again, the more straightforward update and transmit mode of operation can continue.

More complex Thread interaction where one or more peers are temporarily offline. Note that an external invite is the same as a normal invite, but the invite details are encrypted with a single use key, which is sharable with the invite update location.
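The offline scenario can be simulated with two in-memory replicas. The sketch below is a simplified model rather than Textile’s implementation: the Replica class is hypothetical, a MERGE here carries nothing but its sorted parent hashes (one simple way to make it deterministic), and an incoming update that does not build on the local HEAD is always treated as a conflict.

```python
import hashlib
import json


def block_id(update: dict) -> str:
    return hashlib.sha256(json.dumps(update, sort_keys=True).encode()).hexdigest()


class Replica:
    """One peer's local copy of a Thread."""

    def __init__(self, blocks=None, head=None):
        self.blocks = dict(blocks or {})
        self.head = head

    def append(self, update_type: str, author: str):
        update = {"type": update_type, "author": author,
                  "parents": [self.head] if self.head else []}
        digest = block_id(update)
        self.blocks[digest] = update
        self.head = digest
        return digest, update

    def receive(self, digest: str, update: dict) -> None:
        if digest in self.blocks:
            return                         # already known, nothing to do
        self.blocks[digest] = update
        if self.head is None or self.head in update["parents"]:
            self.head = digest             # fast-forward
        else:
            # Conflict: create a local MERGE with two parents. Sorting the
            # parents means every peer computes the same MERGE hash.
            merge = {"type": "MERGE", "parents": sorted([self.head, digest])}
            merge_id = block_id(merge)
            self.blocks[merge_id] = merge
            self.head = merge_id


user_a = Replica()
user_a.append("JOIN", "user-a")
user_b = Replica(user_a.blocks, user_a.head)   # B starts from the same state

annotation = user_a.append("ANNOTATION", "user-a")   # A is offline
data = user_b.append("DATA", "user-b")               # B adds a photo

user_a.receive(*data)          # A comes back online; updates are exchanged
user_b.receive(*annotation)
merged_head = user_a.head      # both replicas now sit on the same MERGE

follow_up = user_b.append("ANNOTATION", "user-b")
user_a.receive(*follow_up)     # a plain fast-forward, no merge needed
```

Since both peers derive an identical MERGE on their own, neither has to broadcast it, which matches the behaviour described for the figures above.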

The same properties that make hash trees or blockchains useful for developing a shared, consistent (consensus-driven) state also make it possible to address our fourth requirement: the ability to recover the full state from the network as a whole. Because each Thread update references its parent(s), given a single point on the Thread chain, we can trace back all the way to the beginning of the Thread. For example, at any point along the sequence in the above figures, a peer can trace back the history of the Thread, as indicated by the solid arrows. This works particularly nicely when a peer JOINs a thread, even at a point prior to the current HEAD. They can simply JOIN, and any existing Thread member can send them the latest HEAD (even via offline messages if needed). From here, they can explore the entire history of the Thread with ease. This is all really similar to git commit speak, in which one only needs to know about a single commit to be able to trace back the entire history of a code project; it’s also essentially how blockchains work.
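Tracing back from HEAD amounts to a walk over parent links. In the Python sketch below, a plain dictionary stands in for fetching updates from IPFS by hash, and the readable keys and the exact shape of the example graph are hypothetical, chosen only to resemble the join example above.

```python
def history(blocks: dict, head: str) -> list:
    """Collect every update reachable from head by following parent links."""
    seen, order, stack = set(), [], [head]
    while stack:
        current = stack.pop()
        if current in seen:
            continue                  # merges make some updates reachable twice
        seen.add(current)
        order.append(current)
        stack.extend(blocks[current]["parents"])
    return order


# Readable names replace real hashes to keep the example short.
blocks = {
    "join_a": {"type": "JOIN", "parents": []},
    "data_1": {"type": "DATA", "parents": ["join_a"]},
    "data_2": {"type": "DATA", "parents": ["data_1"]},
    "join_b": {"type": "JOIN", "parents": ["data_1"]},
    "merge": {"type": "MERGE", "parents": ["data_2", "join_b"]},
}

full_history = history(blocks, "merge")
```

A new member who is handed only the hash of HEAD can run this kind of walk to rebuild the whole Thread, one fetched update at a time.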

5. Content Addressing — store everything on IPFS and get ready to scale

As we alluded to earlier, each update to a Thread is backed by an IPFS CID hash (i.e., they are content addressable chunks of data on IPFS). This means where the data is stored is no longer relevant… IPFS will find it on the network via its hash. This helps us address our fifth requirement, that we have a way to link updates via their content, rather than where they are stored. We’ve covered this topic a lot in the past, but for the uninitiated, the next paragraph provides a summary of how content addressing on IPFS works (pulled from this previous article).

Rather than referencing a file or chunk of data by its location (think HTTP), we reference it via its fingerprint. In IPFS and other such systems, this means identifying content by its cryptographic hash, or even better, a self-describing content-addressed identifier (multihash). A cryptographic hash is a (relatively) short alphanumeric string that’s calculated by running your content through a cryptographic hash function (like SHA-256). For example, when the (unencrypted) Textile logo is added to IPFS, its multihash ends up being QmbgGgWW3vH7v9FDxVCzcouKGChqGEjtf6YLDUgSHnk5J2. This ‘hash’ is actually the CID (Content IDentifier) for that file, computed from the raw data within that PNG. For all practical purposes, it is unique to the contents of that file, and that file only. If we change that file by even one bit, the hash will become something completely different.
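For the curious, here is a small Python sketch of how a ‘Qm…’ style identifier is put together: a SHA-256 digest is prefixed with two bytes describing the hash function and digest length, and the result is base58-encoded. One important caveat: IPFS hashes its own encoding of the file (possibly split into chunks) rather than the raw bytes, so this sketch will not reproduce the logo’s CID above. The function names are hypothetical.

```python
import hashlib

B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58_encode(data: bytes) -> str:
    """Encode bytes using the Bitcoin-style base58 alphabet."""
    number = int.from_bytes(data, "big")
    encoded = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        encoded = B58_ALPHABET[remainder] + encoded
    # Each leading zero byte is represented by a leading '1'.
    padding = len(data) - len(data.lstrip(b"\x00"))
    return "1" * padding + encoded


def sha256_multihash(content: bytes) -> str:
    """Self-describing hash: <hash function code><digest length><digest>."""
    digest = hashlib.sha256(content).digest()
    # 0x12 identifies sha2-256 and 0x20 is the digest length (32 bytes).
    return base58_encode(bytes([0x12, 0x20]) + digest)


original = sha256_multihash(b"textile")
changed = sha256_multihash(b"textilf")   # a tiny change to the content
```

The two-byte prefix is what makes the identifier self-describing, and it is also why these identifiers all begin with the same ‘Qm’ characters.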

Now, when we want to access a file over IPFS (like the above logo), we can simply ask the IPFS network for the file with that exact CID. The network will find the peers that have the data (using a DHT), retrieve it, and verify (using the CID) that it’s the correct file. What this means is we can technically get the file from multiple places because as long as the file matches the hash, we know we’re getting the right data. Which brings us to the solution to our final requirement… use IPFS! For now, Textile is maintaining a network of large, homogeneous, volunteer nodes (we call them Cafes) to ‘pin’ and store content on IPFS. It is important to note here that the other nodes doing the pinning are the same as the nodes on your phone: Textile Nodes that offer a pinning service to other peers. Soon, we’ll allow users to elect their own Cafe nodes, and even add additional nodes for redundancy. All this could eventually be driven by Filecoin for even greater scalability and flexibility.
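The ‘fetch from anywhere, then verify’ idea can be sketched as follows. This is an illustration only: ContentStore and fetch are hypothetical names, a plain SHA-256 hex digest stands in for a real CID, and a list of in-memory stores stands in for peers discovered through the DHT.

```python
import hashlib


class ContentStore:
    """A provider that stores blobs under the hash of their content."""

    def __init__(self):
        self.blocks = {}

    def put(self, content: bytes) -> str:
        key = hashlib.sha256(content).hexdigest()
        self.blocks[key] = content
        return key

    def get(self, key: str):
        return self.blocks.get(key)


def fetch(key: str, providers: list) -> bytes:
    """Return the first copy whose hash matches the key we asked for."""
    for provider in providers:
        content = provider.get(key)
        if content is None:
            continue
        # Verification makes the source irrelevant: a wrong or tampered
        # copy simply fails the check and the next provider is tried.
        if hashlib.sha256(content).hexdigest() == key:
            return content
    raise KeyError(f"no provider returned valid content for {key}")
```

Because validity is checked against the identifier itself, it does not matter whether the bytes come from a Cafe, a friend’s phone, or some future storage provider.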

What’s Next?

So there you have it. Five solutions to five requirements for seamless, secure, decentralized photo sharing and backup. Easy 😉. And at a conceptual level, the Textile Thread protocol is relatively simple: blocks of operations chained together to produce a beautiful Thread of photos. But there’s a lot of complexity going on under-the-hood that has required a lot of experimentation, testing, and limit pushing, especially on mobile. And our journey isn’t over yet.

The Textile team is still hard at work iterating, updating, and improving upon what we already have working. For example, we’ll soon be moving to a new offline messaging system that allows us to drop the custom DHT fork, and move back to the public IPFS network. On top of this, our move to more powerful backup and recovery capabilities has us taking new approaches to security, profile management, offline interactions, and much much more. On top of these changes, the team is actively working to modularize the Threads concept and code into its own stand-alone package, which should provide developers with something akin to a Realm and/or Firebase layer for decentralized mobile applications!

If you are interested in learning more about this stuff, reach out over Twitter or Slack, or pull us aside the next time you see us at a conference or event. We’re happy to provide background, thoughts, and opinions on how we think the future of decentralized apps will play out. In the meantime, don’t forget to check out our GitHub repos for code and PRs that showcase our current and old implementations. We try to make sure all our development happens out in the open, so you can see things as they develop. Additionally, if you haven’t already, don’t miss out on signing up for our waitlist, where you can get early access to Textile Photos, the beautiful interface to Textile’s Threads.