Cinode has received a lot of improvements and reached some milestones recently, so I wanted to take a step back and summarize Cinode as it is right now. If you know nothing about Cinode - then this post is especially for you. Let’s start with a high-level technical overview, followed by a Q&A to clarify the details.

Layers, layers everywhere

Cinode is a layered protocol with three main layers: public, private and publisher.

Architecture overview

📨 Public layer

On the public layer, nodes form a distributed network to communicate with each other and exchange information. The main assumption here is distrust - nodes assume that any other player in the network could be malicious. For that reason, any data transferred between nodes is cryptographically validated.

Everything that moves between nodes is also encrypted, and the nodes themselves are not able to read the plaintext data at this layer. There are simply no encryption keys here that would allow decryption, at least not in a usable form. That way the original, unencrypted dataset cannot leak (some information does leak though, which I discuss in the Q&A). As a consequence, cryptographic validation of exchanged data must happen on the encrypted dataset, which affects the way the validation algorithms are designed.

This layer is also the main storage area of Cinode. Data blobs are of course stored here in an encrypted form, which improves safety, anonymity and privacy. Other layers should not persist the data directly. That constraint minimizes the risk of unauthorized data recovery by someone gaining access to the storage medium of any node.

This layer is called Public because it is assumed that the data in this layer may be publicly available. If that happens, potential information leakage should be minimal.

📃 Private layer

Data from the public layer is consumed in the private one. At this level the data is decrypted and converted into a more useful structure that looks like a filesystem. That in turn allows writing tools such as a web proxy, where this filesystem-like structure is exposed through a web server (see the sketch below).
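To give a rough idea of what that looks like, here is a minimal sketch in Go of a filesystem-like view exposed through a plain HTTP server. The Filesystem interface and the in-memory stand-in are assumptions made purely for illustration - the real proxy serves content decrypted from public-layer blobs and its API may look quite different.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
)

// Filesystem is a hypothetical interface over the decrypted,
// filesystem-like view that the private layer builds on top of blobs.
// The real API in the Cinode source may look different.
type Filesystem interface {
	// Open resolves a path such as "/index.html" to readable content.
	Open(path string) (io.ReadCloser, error)
}

// memFS is a stand-in implementation so the example runs on its own;
// in a real proxy the content would come from decrypted public-layer blobs.
type memFS map[string][]byte

func (m memFS) Open(path string) (io.ReadCloser, error) {
	data, ok := m[path]
	if !ok {
		return nil, fmt.Errorf("not found: %s", path)
	}
	return io.NopCloser(bytes.NewReader(data)), nil
}

func main() {
	fs := memFS{"/": []byte("<h1>Hello from a Cinode-like proxy</h1>")}

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		content, err := fs.Open(r.URL.Path)
		if err != nil {
			http.NotFound(w, r)
			return
		}
		defer content.Close()
		io.Copy(w, content)
	})
	log.Fatal(http.ListenAndServe("localhost:8080", nil))
}
```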

Such a server is a perfect fit for static web pages such as this blog - and in fact this blog is deployed using Cinode. But other use cases are also possible. A slightly more advanced web server that will also be able to publish new data to the network through a built-in Cinode API is already on the roadmap.

The layer is called Private to emphasize that we operate on some form of private information here, and this information should be protected from leakage. For that reason, this layer should avoid keeping any unencrypted data for a long time or storing it on durable storage. If possible, decryption should also happen close to the end user who reads the data - e.g. there could be a small Raspberry Pi with a Cinode web proxy server attached to a local home network and only available within that network.

Of course the privacy of this layer is not strictly enforced; it depends on the use case. Take this blog as an example - it is exposed publicly on purpose so that anybody can read it. But there could be, for example, an alternative version that includes drafts of future posts. Such a version would be available only to reviewers prior to publication.

✏️ Publisher layer

This layer is currently realized as an offline generation process. Here the encrypted dataset used in the public layer is generated. As an example, this blog has a dedicated CI/CD pipeline where the data is compiled into a public dataset. Once generated, such a dataset is then published on the public layer.

Publishing data on the public layer can be done in many ways. Data blobs can be sent into an already running node, a new dataset could be exposed as a completely new node, data could be mailed on a DVD to someone who injects it into the network in their own way, etc. The point is to ensure that other nodes in the network that need the data can somehow see it. If needed, tiny blobs can even be sent on small pieces of paper attached to pigeons 🕊 - everything will still work as long as the data finally reaches the interested nodes 😉.

💾 Data blobs

All data floating around in Cinode is split into encrypted blobs. Those are the basic units of information. Each blob is identified by its own name and contains some encrypted data associated with it. The name is a crucial part of the cryptographic validation of the blob and it must match the data that comes with it. I’ll leave a more detailed description of this validation process for some other post. Some drafts of the idea can be found in earlier posts: 1, 2, and there is a brief description below and in the Q&A section.

There’s one basic rule that applies to all blobs which I call the rule of forward progress. Due to the distributed nature of the network, in the case of dynamic links (and other potential future blob types) there may be multiple versions of a blob with the same name. The rule of forward progress requires that there is a deterministic way to resolve a conflict between those versions, ensuring that the network as a whole drifts towards a unified view of all blobs.

How this rule is realized depends on the blob type. Dynamic links do this by comparing link versions and some other properties.
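To illustrate the idea, here’s a minimal sketch of such a deterministic resolution in Go. The fields and the tie-breaker are my simplified assumptions for this post, not the exact rules used by Cinode:

```go
package main

import (
	"bytes"
	"fmt"
)

// DynamicLink is a simplified stand-in for a dynamic link blob.
// The fields and the tie-breaker below illustrate the idea only;
// the actual Cinode scheme compares versions plus other properties.
type DynamicLink struct {
	Version uint64 // update counter carried by the link
	Payload []byte // signed link data pointing at the target blob
}

// resolve deterministically picks one of two conflicting versions of the
// same link, so every node that sees both ends up with the same winner.
func resolve(a, b DynamicLink) DynamicLink {
	if a.Version != b.Version {
		if a.Version > b.Version {
			return a
		}
		return b
	}
	// Same version: fall back to a deterministic byte-wise comparison
	// of the payloads (a purely illustrative tie-breaker).
	if bytes.Compare(a.Payload, b.Payload) >= 0 {
		return a
	}
	return b
}

func main() {
	older := DynamicLink{Version: 3, Payload: []byte("blob-A")}
	newer := DynamicLink{Version: 4, Payload: []byte("blob-B")}
	fmt.Println(string(resolve(older, newer).Payload)) // blob-B
}
```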

In the case of static blobs, the blob name is tightly coupled with the content, so there can be only a single version of a blob for a given name. That’s because even the slightest change in the data will change that name. I tend to say that for static blobs the immutability of the content guarantees forward progress - there will be no ambiguity when resolving the conflict because the conflict simply cannot happen (well, if SHA-2 ever gets broken ambiguity may happen, but then everybody would have to abandon SHA-2, including Cinode).

Above the raw blob data there’s a filesystem-like abstraction. There are many similarities to modern filesystems, e.g. Cinode dynamic links are like POSIX hard links. Blobs can also be shared between different Cinode filesystem trees, implementing something that works like a copy-on-write mechanism. Overall, many filesystem trees overlap, forming one global graph of filesystem structures.
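As a toy illustration of that sharing, assuming a made-up directory encoding rather than the real Cinode one:

```go
package main

import "fmt"

// BlobName identifies a blob; the values below are made up for illustration.
type BlobName string

// Dir is a toy directory structure mapping entry names to the blob names
// of their content - not the actual Cinode filesystem encoding.
type Dir map[string]BlobName

func main() {
	sharedImage := BlobName("static:abc123") // the same blob referenced from two trees

	blogTree := Dir{"index.html": "static:def456", "logo.png": sharedImage}
	draftTree := Dir{"draft.html": "static:789fed", "logo.png": sharedImage}

	// Both trees point at the identical encrypted blob. Changing the logo in
	// one tree means writing a new blob and updating only that tree - the
	// other tree keeps referencing the old one, copy-on-write style.
	fmt.Println(blogTree["logo.png"] == draftTree["logo.png"]) // true
}
```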

❓ Q&A

Technical

Q: What is Cinode?

  • Distributed network - nodes connect to each other
  • Framework - base building block for various applications
  • Protocol - defined set of rules on how data travels through the system

Q: What is Cinode not?

  • Not a blockchain - there’s no global consensus mechanism, blocks, mining etc., no central point, even conceptually
  • Not a cryptocurrency - Cinode was never designed to handle the concept of money or ownership; that requires central consensus, so let’s leave it to blockchains
  • Not a replacement for the Internet - it lacks crucial features to replace the Internet but may be an important addition to it
  • Not a web application - it is a framework to build apps, it’s not a web app itself
  • Not an organization nor company - just a side project built out of curiosity

Q: How secure is it?

I’m using well-known cryptographic algorithms such as ed25519, SHA-2 and XChaCha20. But in a few places I’ve broken the golden rule of cryptography and come up with new cryptographic building blocks. And the design of the data flow is usually what breaks first and where vulnerabilities are found. For that reason I would strongly suggest waiting for some more formal analysis, by myself or someone else, before trusting the system with any sensitive data. Things may be broken in Cinode - learning how to find vulnerabilities in a somewhat complex network was and still is one of my goals for this project.

Q: How does Cinode store data?

All Cinode data is stored in blobs. Each blob has a name derived from its content; how exactly that is done depends on the specific blob type (I’ll post about that in the future), but a cryptographic hash function is always involved. The name of a blob does not specify where that blob is located in the network (location erasure sounds like a good name for this property). Upper layers reference blobs using their names, and the storage layer must take care of finding and propagating the necessary data.
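A hypothetical storage interface makes the location erasure property a bit more tangible - the name alone is all that is needed to ask for a blob. This is an illustrative assumption, not the actual Cinode API:

```go
// Package storage sketches a hypothetical blob store keyed purely by blob
// name. This is not the actual Cinode API - just an illustration of the
// "location erasure" idea: the name says nothing about where the blob lives,
// so any node, cache or CDN can serve it.
package storage

import "io"

// BlobName identifies a blob; it is derived from the blob's content.
type BlobName string

// BlobStore finds and persists encrypted blobs by name, wherever they
// happen to be replicated in the network.
type BlobStore interface {
	// Fetch returns the encrypted content of the named blob.
	Fetch(name BlobName) (io.ReadCloser, error)

	// Store persists an encrypted blob under its (validated) name.
	Store(name BlobName, content io.Reader) error
}
```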

There are currently two types of blobs:

static - these hold datasets of arbitrary size and can contain any kind of data - HTML pages, images, executable code etc.; the name of such a blob is based on the hash of its content, similarly to how it’s done in IPFS, git, BitTorrent and many more.

dynamic links - these contain a tiny amount of information that points to some other blob; under the hood they use public/private key pairs - just like TLS certificates. The data in the link is signed with the private key, and the name of the blob is derived from the public key. The name does not depend on the link data itself, which means the targeted blob may change over time. You may think of dynamic links as branches in git - they point to some hash, but they can be updated to point to a different commit.
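A rough sketch of how such names could be derived, using the SHA-2 and ed25519 primitives mentioned elsewhere in this post; the prefixes and exact encoding are illustrative assumptions, not the real Cinode format:

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// staticBlobName derives a name purely from the (encrypted) content, so even
// the slightest change in the data produces a different name. The "static:"
// prefix and the exact hashing scheme are illustrative, not Cinode's encoding.
func staticBlobName(content []byte) string {
	sum := sha256.Sum256(content)
	return "static:" + hex.EncodeToString(sum[:])
}

// dynamicLinkName derives a name from the public key only, so the link can be
// re-signed with new target data over time without changing its name.
func dynamicLinkName(pub ed25519.PublicKey) string {
	sum := sha256.Sum256(pub)
	return "dynlink:" + hex.EncodeToString(sum[:])
}

func main() {
	pub, _, err := ed25519.GenerateKey(nil) // nil falls back to crypto/rand
	if err != nil {
		panic(err)
	}
	fmt.Println(staticBlobName([]byte("encrypted page content")))
	fmt.Println(dynamicLinkName(pub))
}
```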

Q: How does the network avoid corrupted blobs?

Any blob retrieved by a node must be checked according to:

  • blob name validation rules - both static blobs and dynamic links have name validation rules and invalid blobs are rejected, e.g. if a dynamic link’s data is signed with a key that does not match the name, it will not pass validation
  • forward progress rules - this currently only applies to dynamic links and determines how to select one of two versions of the link if there’s a conflict, in practice this means that a newer version of the link will always be preferred over the older one (a rough sketch of both checks follows below)
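The sketch below mirrors the two checks for a dynamic link. The structure and the name derivation are simplified assumptions rather than the actual Cinode validation code:

```go
package main

import (
	"crypto/ed25519"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// DynamicLink is a simplified model of a dynamic link blob; the exact field
// layout and name derivation in Cinode differ, this only mirrors the two
// checks described above.
type DynamicLink struct {
	Name      string
	PublicKey ed25519.PublicKey
	Data      []byte
	Signature []byte
}

func validate(l DynamicLink) error {
	// Blob name validation: reject a blob whose name was not derived from
	// its own public key.
	sum := sha256.Sum256(l.PublicKey)
	if l.Name != "dynlink:"+hex.EncodeToString(sum[:]) {
		return errors.New("blob name does not match the public key")
	}
	// Signature validation: reject data signed with a key that does not
	// match the name.
	if !ed25519.Verify(l.PublicKey, l.Data, l.Signature) {
		return errors.New("invalid signature")
	}
	return nil
}

func main() {
	pub, priv, _ := ed25519.GenerateKey(nil)
	sum := sha256.Sum256(pub)
	link := DynamicLink{
		Name:      "dynlink:" + hex.EncodeToString(sum[:]),
		PublicKey: pub,
		Data:      []byte("points-at-some-other-blob"),
		Signature: ed25519.Sign(priv, []byte("points-at-some-other-blob")),
	}
	fmt.Println(validate(link)) // <nil> means the blob passes both checks
}
```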

Q: Can some information still be observed on the public layer even though the data is encrypted?

True, some metadata can still leak - things such as the amount of data sent over the network, which blobs are popular, maybe the frequency of updates. But there’s potential to add countermeasures - e.g. data appearing somewhere in the network does not prove that this is its origin; blobs may float around the network in a hidden form first, before being published in some random place. Or random garbage data could be injected into nodes that cannot be distinguished from valid data. I believe many more solutions will be discovered.

Q: How about scalability?

The network operates on blobs of data, and those blobs can be cheaply replicated. If necessary, they could be stored on some S3 server or put behind a CDN. On the other hand, once a DHT is in place, more popular blobs will propagate to more nodes, so the network will automatically scale simply due to popularity. This is what we can see in BitTorrent or IPFS, for example.

Q: How does Cinode compare to blockchains?

Each blockchain is built around a central ledger, and the network as a whole works on protecting that single source of truth. That is a form of centralization, even if only a conceptual one. In Cinode there’s no such thing. Things are done locally and data validation happens at the level of a single blob. There’s no global consensus.

Also blockchains usually don’t store large amounts of data. Cinode is meant to handle larger datasets.

Q: How is Cinode different from other similar projects?

In Cinode the ownership of data is not tied to servers. Data is always stored in an encrypted form and the base network layer is designed in such a way that propagation and validation of data can happen without decryption. The forward progress rule additionally ensures that data updates will propagate throughout the network in a deterministic way.

There are many other similar projects out there: IPFS, git, BitTorrent and, more recently, nostr. I never did a thorough analysis and comparison though - the goal of Cinode is not to compete but to explore technology.

Plans

Q: What’s the current status?

It is in a proof-of-concept phase. Only the most important features are implemented. There’s no optimization at all, no scale testing. I try to keep the project usable in some way and shape the roadmap around small improvements that each end up with something usable.

The backend of this blog is powered by Cinode. And it’s working. But that’s the only deployment of Cinode I’m aware of at the time of writing this post.

Cinode is still my pet project. Something that I use to explore new ideas, learn new things, and learn from mistakes too.

Q: How can I play with Cinode myself?

There will be a series of blog posts showing how to use Cinode. You can also view this blog through your own Cinode proxy, which I wrote about here. There are no ready-to-use precompiled binaries or Docker images yet; while working on those blog posts I’ll slowly fill that gap.

If you’re interested in the source code, you can find it in this GitHub repository.

Q: What is missing in Cinode?

There are a few crucial missing parts, each of which would greatly increase the usefulness of the project:

  • A Kademlia-based DHT communication network - currently nodes have to talk to each other through HTTP and must know each other’s addresses, which shouldn’t be necessary
  • A Cinode API on the web proxy - without it, it is hard to write any useful, dynamic real-world application
  • Some example applications to show the power of Cinode
  • More blob types - like inbound queues (for things like mail inboxes), some real-time notification channels etc.

Those are just a few things that come to mind. Whenever I finish a milestone, I’ll pick up the next one that in my opinion brings the most value to the system relative to the cost of implementation.

Q: Any crazy ideas?

Well, a few at least. We already discussed using pigeons, didn’t we 😆?

But other than that - I see a lot of potential in blob propagation methods, like using light and a camera, or sound. Steganography also looks interesting here.

Other ideas I’ll keep for myself for now 😉.

Q: Do you want to build a company around it? Get some funding? Start a startup?

I’m not yet sure about the future of Cinode, but I don’t see starting a company as something I would like to do myself. Building a startup is much harder than it looks. It takes a huge effort and requires a ton of money, hiring people, promoting, using connections. It’s hard to maintain a good work-life balance. I have other things in my life that I also want to focus on, some of them more important than Cinode.

Also, once money comes in, priorities may drastically change. It may no longer be about innovation but about ROI or financial stability and obligations. That could boost the project but also push it in the wrong direction.

Q: How about monetization though?

I can see at least a few ways to monetize. But let’s face it, it may be hard. Many monetization strategies rely on access to data and permission to use it, while Cinode is all about protecting the data.

Personal

Q: How was Cinode created?

I started Cinode a long time ago - I don’t remember exactly when, but it would be more than 10 years already. It was my playground. I wanted to learn the Go language and experiment with cryptography. I got inspired by a few projects, started thinking, put things into a design and started to write the code.

Q: 10 years ago? What took so long?

Long breaks between phases, doing it in my spare time, other personal projects taking over… just the usual stuff.

Q: Will there be timely updates in the future?

Hard to say. But looking at the history of this project - unlikely. I had a bit more time recently to focus on Cinode and managed to push it forward, but that’s more of an exception than a rule. If you like this project though, let me know - it encourages me to allocate a bit more time here.

Q: How would you define Cinode’s success?

For me it’s already a success - I learned a lot while designing and implementing it, and that was my main goal.

It’s not the end though - the next step would be to make it more useful in a practical sense, allowing more complex and dynamic applications to be built with Cinode, not only static blogs.

After that it would be about wider adoption - that’s something I would consider a success for Cinode, though I’m not sure if a success for myself.

Q: Do you rule out the possibility of wide adoption then?

No. I’m open to different outcomes. Cinode may forever remain an unknown technology that nobody uses, or it may become a hugely popular thing used by many. I can accept both, and anything in between too. Let’s wait and see.

Q: How can I contact you?

You can find my email in my git commits. And the source code is in this GitHub repository.