Why every data scientist should pay attention to crypto

Hint: it is not what you think

Photo by Zoltan Tasi on Unsplash

Contrary to what you might be thinking, I am not here to tell you that you can use machine learning to figure out which meme-inspired crypto token is poised to break out, or that you can use deep-learning-based trading strategies to generate a ton of alpha. Those things might be true, but they are not why I am personally excited by crypto, and they are not why I think you should be as well. My excitement is driven by three facts:

  1. Crypto is becoming an increasingly important platform for consumer products
  2. Decentralized crypto applications create a ton of data that are granular, verifiable and publicly available
  3. Effectively leveraging this data asset is key to realizing crypto’s potential

I believe this will create a huge opportunity for data science expertise in the crypto community.

A quick note on terminology

Crypto in the context of this post refers specifically to decentralized, smart-contract-capable blockchains and the applications built on top of them. The Ethereum and Solana blockchains are well-known examples of this type of crypto ecosystem. Notably, the Bitcoin blockchain is not, as it has limited functionality when it comes to building smart contract applications. If this sounds foreign to you, a simple mental model is an ecosystem where blockchains serve as distributed computing backends whose state is driven by the consensus of the participants on the network, and where these backends are used, via smart contracts, to build user-facing applications.

Crypto is becoming an increasingly important platform for consumer products

When looking at the crypto space, it is hard not to see the wild gyrations of token prices and the myriad meme-inspired projects as one giant speculative casino with no real value. To some extent, this is true. However, this distracts us from the underlying technology, which has enormous potential to enable the next generation of consumer applications. As many have argued [1][2][3] more eloquently than I can, blockchain-backed decentralized applications may provide effective solutions to many of the problems associated with the consumer apps we have today. In particular, a handful of powerful platforms, like Facebook and Google, have effectively monopolistic control over consumers’ digital lives. They can elect to censor in fairly arbitrary ways and in service of their own interests, even to the detriment of users. They also extract nearly all of the economic value from the ecosystem, at the expense of the participants who create most of that value.

Decentralized blockchain networks enable a different paradigm by offering a distributed computational framework that anyone can use to build decentralized consumer-facing applications. These applications, also known as dApps, are built from smart contracts that live and execute transparently on the blockchain. They provide a way to remove the middleman and put both governance and economic value back in the hands of the consumers and producers who participate in the ecosystem.

If this sounds like a far-fetched ideal, it mostly is today. There are still many performance, governance, and just plain implementation issues to sort out. This website offers an entertaining perspective on the various failed attempts at building crypto applications. However, the space is also undeniably gaining traction, with many dApps achieving truly impressive scale and adoption across a variety of consumer domains. For example:

  1. Uniswap, a decentralized exchange on the Ethereum blockchain, allows users to trade cryptocurrencies without centralized market makers or centralized platforms that take custody of the assets. As of the writing of this post, it handles about $1.5B of transaction value per day, which is about 40% of that of Coinbase [4], arguably the most popular centralized cryptocurrency exchange in the US.
  2. OpenSea, a marketplace that operates primarily on the Ethereum and Polygon blockchains, allows users to discover and trade NFTs (tokens representing ownership of unique digital assets). In August 2021, ~220K users traded NFTs collectively worth about $3.5B [5]. To provide a relative sense of scale, Etsy and eBay did ~$3B and ~$22B of gross transaction value, respectively, in all of Q2 2021 [6]. Looking at August alone, OpenSea’s volume exceeded Etsy’s for the entire quarter and was roughly half of eBay’s average monthly volume, which is truly impressive for any marketplace, decentralized or not.
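The scale comparisons above can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in this post. It assumes a quarterly figure can be spread evenly over three months, which is a simplification, and the variable and function names are mine.

```python
# Figures quoted in this post, in billions of US dollars.
UNISWAP_DAILY = 1.5               # Uniswap daily transaction value
UNISWAP_SHARE_OF_COINBASE = 0.40  # Uniswap volume as a fraction of Coinbase's
OPENSEA_AUGUST = 3.5              # OpenSea NFT volume, August 2021
ETSY_Q2 = 3.0                     # Etsy gross transaction value, Q2 2021
EBAY_Q2 = 22.0                    # eBay gross transaction value, Q2 2021


def monthly_average(quarterly_value):
    """Assumption: spread a quarterly figure evenly over three months."""
    return quarterly_value / 3


# Coinbase daily volume implied by "Uniswap is about 40% of Coinbase".
implied_coinbase_daily = UNISWAP_DAILY / UNISWAP_SHARE_OF_COINBASE

# One month of OpenSea compared with a full quarter of Etsy.
opensea_vs_etsy_quarter = OPENSEA_AUGUST / ETSY_Q2

# One month of OpenSea compared with an average month of eBay.
opensea_vs_ebay_month = OPENSEA_AUGUST / monthly_average(EBAY_Q2)

print(f"Implied Coinbase daily volume: ~${implied_coinbase_daily:.2f}B")
print(f"OpenSea August vs Etsy Q2: {opensea_vs_etsy_quarter:.2f}x")
print(f"OpenSea August vs eBay monthly average: {opensea_vs_ebay_month:.2f}x")
```

Under these assumptions, one month of OpenSea volume is a bit more than a full quarter of Etsy and a little under half of an average eBay month.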

These applications, along with many others like Audius in music and Axie Infinity in gaming, are clearly pushing the boundaries of what it means to be a consumer product. I would even argue that many of them are already important consumer products that have large, loyal user bases, despite their many rough edges. We are still at the beginning of the adoption and growth curve, and as blockchain technology continues to improve, I believe we will see more and more of these decentralized consumer products emerge and thrive.

Decentralized crypto applications create a ton of data that are granular, verifiable and publicly available

Compared to traditional applications, one very notable feature of decentralized applications is their data exhaust. Due to the nature of blockchain technology, many of the most important user actions are recorded transparently on the blockchain ledger. Take Uniswap, for example: one can easily see the details of every individual cryptocurrency trade, which look like this:

A Uniswap transaction that traded ~0.386 ETH for 1,199 NWC tokens, on Etherscan

… one can also see when specific trading pairs are created, and how much liquidity is available for trading.

The creation of the ETH-NWC trading pair pool, on Etherscan

In effect, every value-creating activity within the Uniswap product is public and transparent, as it happens, for anyone who knows where and how to look. These records are also independently verifiable, and cannot be changed (or are incredibly difficult to change) by anyone after block confirmation. This applies to the on-chain activity of any decentralized application, regardless of what the unit of economic activity may be. It could be a trade, a stream of a song, or even a strike by an avatar in a game. This represents a radically different paradigm compared to the products we know today, where at best you can get aggregated metrics from publicly traded companies a quarter after the value-creating activity has occurred. In the case of private companies, there is often no visibility at all.
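To make the "verifiable and hard to change" property more concrete, here is a toy sketch of a hash-linked ledger. It is not how Ethereum or any real blockchain is implemented: real networks add consensus, signatures, and more elaborate data structures. The function names and the records are made up for illustration. The only point is that when each block commits to the hash of the previous one, anyone can recompute the hashes and detect an edit to an earlier record.

```python
import copy
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, prev_hash, records):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "records": records},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Build a list of blocks, each committing to the one before it."""
    chain = []
    prev_hash = GENESIS_HASH
    for index, records in enumerate(batches):
        current = block_hash(index, prev_hash, records)
        chain.append(
            {"index": index, "prev_hash": prev_hash,
             "records": records, "hash": current}
        )
        prev_hash = current
    return chain


def verify_chain(chain):
    """Recompute every hash; return True only if nothing has been altered."""
    prev_hash = GENESIS_HASH
    for block in chain:
        if block["prev_hash"] != prev_hash:
            return False
        expected = block_hash(block["index"], block["prev_hash"], block["records"])
        if expected != block["hash"]:
            return False
        prev_hash = block["hash"]
    return True


# Made-up records, for illustration only.
chain = build_chain([
    [{"action": "trade", "amount": 1}],
    [{"action": "stream", "amount": 1}],
    [{"action": "strike", "amount": 1}],
])
print(verify_chain(chain))  # True

# Quietly edit an old record in a copy of the ledger.
tampered = copy.deepcopy(chain)
tampered[0]["records"][0]["amount"] = 100
print(verify_chain(tampered))  # False
```

Recomputing the hash of the edited block does not help the editor either, because the next block still points at the old hash, so the check fails one step later.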

With more crypto consumer applications gaining traction, the growing trove of publicly available data will fundamentally change how the next generation of products compete and operate.

Effectively leveraging this data asset is key to realizing crypto’s potential

Imagine if all of Robinhood’s granular trade data were publicly available. What would that mean? It would undermine Robinhood’s payment for order flow business model, which accounts for the lion’s share of its revenue. Similarly, if Facebook’s data on consumers were publicly available, it would not be able to monetize that data the way it does today.

If we really think about this, many of the platform companies we use today extract rent by standing between consumers and producers and exploiting their proprietary access to data and distribution channels. A transparent data ecosystem, like the one created by dApps built on the blockchain, makes it nearly impossible to seek rent in this way. This is simply because anyone with the right expertise can access the data, use it to discover that some product or service is extracting more value than it provides, and build a better alternative to capture that value and eliminate the inefficiency. This is analogous to arbitrage in stock markets, where, because the data is fairly transparent and standardized, opportunities to make free money tend to be exploited and eliminated quickly. I would argue that if this were true in the consumer application world, we would have greater accountability, fairer competition, better products, and a more equitable share of the value for consumers and producers.

If all of this sounds abstract, allow me to provide a more concrete example. The head of product at OpenSea, the crypto NFT marketplace, was recently asked to resign after a user discovered that he was unfairly profiting by buying NFTs moments before OpenSea promoted them on its website [7]. This is certainly a case of growing pains for OpenSea and for the broader crypto application ecosystem. However, it is also a profound, albeit small, example of how data transparency led to accountability for a bad actor who would otherwise have gotten away with it. An enterprising individual, whether she realized it or not, did some excellent data science to figure out what was happening and sounded the alarm. The utility of this data is not limited to discovering bad actors. With full visibility into every NFT, every transaction on the OpenSea platform, and how the trading mechanism is implemented on the blockchain, it is very possible to create a competing product if OpenSea fails to provide adequate value for the fee it charges. This creates an accountability mechanism that helps ensure the interests of the consumer are looked after. Of course, this is all contingent on effectively understanding and leveraging the data asset.
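The kind of analysis that surfaced the OpenSea incident can be sketched in a few lines. Everything in the example below is hypothetical: the record layout, the wallet and item names, the one-hour window, the threshold, and all of the data are made up for illustration. It is not the actual on-chain data, nor the method used by the person who raised the alarm. It simply counts, per wallet, how many purchases happened shortly before the purchased item was promoted.

```python
from collections import Counter
from datetime import datetime, timedelta

# Synthetic purchase records, for illustration only.
purchases = [
    {"wallet": "wallet_a", "item": "item_1", "time": datetime(2021, 9, 1, 9, 50)},
    {"wallet": "wallet_a", "item": "item_2", "time": datetime(2021, 9, 2, 14, 40)},
    {"wallet": "wallet_a", "item": "item_3", "time": datetime(2021, 9, 3, 11, 30)},
    {"wallet": "wallet_b", "item": "item_1", "time": datetime(2021, 8, 20, 8, 0)},
    {"wallet": "wallet_c", "item": "item_3", "time": datetime(2021, 9, 3, 11, 45)},
]

# Synthetic times at which each item was promoted on the front page.
promotions = {
    "item_1": datetime(2021, 9, 1, 10, 0),
    "item_2": datetime(2021, 9, 2, 15, 0),
    "item_3": datetime(2021, 9, 3, 12, 0),
}


def count_pre_promotion_buys(purchases, promotions, window):
    """Count, per wallet, purchases made within `window` before a promotion."""
    counts = Counter()
    for purchase in purchases:
        promoted_at = promotions.get(purchase["item"])
        if promoted_at is None:
            continue  # item was never promoted
        lead_time = promoted_at - purchase["time"]
        if timedelta(0) <= lead_time <= window:
            counts[purchase["wallet"]] += 1
    return counts


def flag_wallets(counts, min_hits):
    """Return wallets with at least `min_hits` suspiciously timed purchases."""
    return sorted(wallet for wallet, hits in counts.items() if hits >= min_hits)


counts = count_pre_promotion_buys(purchases, promotions, timedelta(hours=1))
suspects = flag_wallets(counts, min_hits=3)
print(suspects)  # ['wallet_a']
```

A flag like this is a starting point rather than proof. A real analysis would need a baseline for how often well-timed purchases happen by chance, and a way to link wallets to people.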

As a data scientist, I find this very exciting; it feels like the beginning of something potentially paradigm-shifting. It is clear that data transparency is a key feature of the fledgling crypto ecosystem, and a crucial tool for building platforms that truly serve the consumer better. Data science practitioners will have an opportunity, and even a responsibility, to help create this new paradigm of transparency and accountability. Only by making crypto data more accessible and standardized can we leverage its insights to help build the future.

Over the next few posts, I will be sharing more details on how to work with smart contract data, including an overview of the data structure, how to acquire the data, the tools for working with it, and some deep dive analysis examples. If you want to learn more about this, please follow me on Medium and Twitter so you can get the latest when I post my next article.

Feel free to reach out if you have comments or questions. Twitter | LinkedIn | Medium

References

[1] From Web 1.0 to Web3: How the Internet Grew Over The Years

[2] The Value Chain of the Open Metaverse

[3] Internet 3.0 and the Beginning of (Tech) History

[4] https://www.theblockcrypto.com/data/decentralized-finance/dex-non-custodial

[5] https://dune.xyz/yifeihuang/Opensea

[6] https://decrypt.co/79789/opensea-3b-month-ethereum-nft-sales-amazon-ebay-etsy

[7] https://twitter.com/ZuwuTV/status/1437921263394115584
