How the Airbnb of Data Storage Plans to Take on Amazon S3

Music. Videos. Pictures. Games. Snaps. Tweets. Statuses. As a civilization, the amount of data we create each year is increasing at an exponential rate, driven by the proliferation of new apps, new formats, and ubiquitous recording devices. The data may come from almost innumerable sources, but today, a small number of companies handle the majority of its storage.

Among these, Amazon dominates: its cloud business, which includes the S3 storage service, holds more than one third of the global market for cloud infrastructure services. That’s why data storage is a critical area for the growing web decentralization movement, and a handful of startups are now trying to tackle it.

Storj is one of them: a decentralized cloud storage solution targeted squarely at web developers. On October 30, Storj released a public alpha version of the new platform, version 3.0—now available through the Storj GitHub—and a white paper outlining many of the design choices that were made in building the components of the new release.

BREAKER caught up with Shawn Wilkinson, founder of Storj and currently its chief strategy officer, to talk about how the product has evolved since version one, why Dropbox is a front end for Amazon, and what it will take to bring web decentralization to the mainstream.

We just published an explainer on web decentralization, and it seems like there’s a lot of prior work needed to describe what it is and why it’s important. How do you explain Storj to someone without a technical background?
I’ll start with the first part. The best way to explain things is often by analogy, so we say it’s Airbnb for your disk drive, where you can rent out extra hard drive space and get paid for it. People seem to get that pretty easily. Then we transition to say that we’re mainly focused on cloud storage for applications, because the developers are the ones who really make the choices about cloud storage for us. For years and years, something like Dropbox stored all its data on Amazon S3. Applications we know and love are mostly using platforms like S3 to store our data. That’s why we focus on being a new platform for those applications.

So, you launched Storj in 2014. The public alpha version 3.0 was just released. What have been some of the milestones in between?
In 2014 we started with, I’d say, a naively optimistic approach. We said at the time, we’ll make a distributed Dropbox and that will be wonderful and great. When we learned about the market, we saw that Dropbox was sending all their data to Amazon S3—really they were like a front end. So we thought, if the people making decisions about cloud storage are developers, we need to make a developer-focused platform, and enable people to build out these applications, rather than essentially being another product.

We built the early version one of the network in 2014-15. We launched version two at the beginning of last year. And we did have a lot of people using the network, but it wasn’t getting the scale we needed. You see that with a lot of decentralized projects: ‘OK, it works, but there’s no way we can accommodate millions of users right now.’

For version three, we re-architected the system, took a lot of feedback from users, and figured out what we wanted to change. Two of these things, briefly: First, we optimized how we’re doing replication. If you’re storing data on people’s computers all around the world, at any time some might go offline, so you need multiple copies. In the version two network, there was about 8x or 9x replication, but in the v3 network, we could bring that down to 2-3x. So, we got huge efficiencies there.
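The interview does not say how the overhead was cut from 8-9x to 2-3x. One widely used way to get there is erasure coding: split a file into n pieces so that any k of them can rebuild it, which costs n/k times the original size. The sketch below is only an illustration under that assumption. The values k=20, n=50 and the 90% node uptime are hypothetical, chosen to land on a 2.5x overhead, and nodes are assumed to fail independently.

```python
from math import comb


def expansion_factor(k: int, n: int) -> float:
    """Storage overhead of a k-of-n erasure code: n pieces, each 1/k of the file."""
    return n / k


def replication_availability(copies: int, p_online: float) -> float:
    """Probability that at least one of `copies` full replicas is online."""
    return 1.0 - (1.0 - p_online) ** copies


def erasure_availability(k: int, n: int, p_online: float) -> float:
    """Probability that at least k of n pieces are online (binomial tail)."""
    return sum(
        comb(n, i) * p_online**i * (1.0 - p_online) ** (n - i)
        for i in range(k, n + 1)
    )


if __name__ == "__main__":
    p = 0.9  # assumed chance that any single node is online (hypothetical)
    print("8 full copies, overhead 8.0x, availability:",
          replication_availability(8, p))
    print("20-of-50 code, overhead", expansion_factor(20, 50),
          "x, availability:", erasure_availability(20, 50, p))
```

Under these assumptions, both schemes keep the file retrievable with very high probability, but the erasure-coded layout stores less than a third of the bytes that eight full copies would.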

Second thing, in v2 we had our own libraries that would be used to store data on Storj, but it took users a while to get used to, maybe a few hours or a few days to integrate. With v3, we really wanted to reduce the friction, so we made it compatible with Amazon S3, the gold standard of how centralized applications store data. We’ve given the v3 alpha to our partners and they’re like, ‘we made it work in 10 minutes.’
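In practice, S3 compatibility usually means an existing S3 client can be pointed at a different endpoint without changing the rest of the application. The sketch below shows that idea with the third-party boto3 library. The endpoint URL, credentials, bucket, and object names are all placeholders invented for illustration. They do not come from the interview or from Storj's documentation.

```python
def s3_client_kwargs(endpoint_url: str, access_key: str, secret_key: str) -> dict:
    """Build keyword arguments for boto3.client() aimed at a custom S3 endpoint."""
    return {
        "service_name": "s3",
        "endpoint_url": endpoint_url,  # the only line that differs from plain S3 use
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }


def upload_bytes(client, bucket: str, key: str, data: bytes) -> None:
    """Store `data` under `key` using the standard S3 put_object call."""
    client.put_object(Bucket=bucket, Key=key, Body=data)


def main() -> None:
    # Not called automatically: needs boto3 installed and a reachable gateway.
    import boto3  # third-party; pip install boto3

    kwargs = s3_client_kwargs(
        "http://localhost:7777",  # hypothetical S3-compatible gateway address
        "EXAMPLE_ACCESS_KEY",     # placeholder credentials
        "EXAMPLE_SECRET_KEY",
    )
    client = boto3.client(**kwargs)
    upload_bytes(client, "example-bucket", "hello.txt", b"hello")
```

Because only the endpoint and credentials change, application code that already calls `put_object` would not need to be rewritten, which fits the quick integration the partners describe.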

A lot of the people involved in building or theorizing the decentralized web come at it from an ideological angle just as much as a technological one. Do you feel a sense of social purpose in what you do?
You’re right that a lot of the root concerns of blockchain and cryptocurrency, the roots that Storj came from, are ideological: ‘We want more privacy, we want more security, we want control over our data and we don’t want governments and companies looking through it.’ One of the things we really try to focus on is, how can we put that into practice? How can we build a product that respects the privacy and security of our users, but doesn’t take 31 steps to get there, and has a good user experience?

Some of these things are easy to do: We want people to have privacy and control over their data, and we’re storing data on multiple untrusted drives all over the world. So it’s a simple decision to encrypt the data client-side and only give users the keys. We have to do it that way, and it provides a huge benefit for security.

Some things are more difficult in terms of architecture and on-boarding. For example, we’re launching the version three network next year, and we want to make sure the people on the network initially aren’t running it on a laptop that they’re going to turn off half the time, over a dial-up connection. How do we make sure we get the people we want, early in the network? We could have some sort of a wait list and filter through it, and automate the process later on. We want to be practical, but we also don’t want to take shortcuts.

Do you think the adoption of decentralized applications will come from market demand, or from startups supplying products that might not obviously be decentralized?
I think when you’re building something disruptive, it doesn’t move the needle to be more ideologically pure. If that was the case, Uber wouldn’t be around, and Lyft would be the biggest ride-sharing company. But that does bring in interest: There are data breaches at large companies, or an Amazon S3 outage that takes out a quarter of the internet, and people take notice.

What you really need to sell people on is having tangible benefits versus centralized counterparts. You start with something like, “It’s more private and secure,” and people go “OK, that’s interesting.” And then you drop into, “Oh, and it’s half the cost,” and then of course it’s even more interesting to them.

If you’re distributing data all over the world rather than putting it in a $600 million data center in rural Kansas, you can get a lot more performance out of it. You start to stumble onto real benefits the product has over its counterparts. It might start with an ideological position, but real disruption is showing someone that what you’re making is better than what they’re already using, feature for feature. Because that’s when they’ll switch.

This interview has been edited and condensed.