Data has been described as the new oil. But it is now literally gushing everywhere. And that’s only going to get worse with the rise and rise of electric vehicles (EVs), which will create a global data storage crisis, overwhelm even giants like AWS, and require radical new storage models.
With Covid-19 lockdowns accelerating a shift into the digital world, the global datasphere (total data created in a single year) hit 64 ZB in 2020, up 400% since 2016. The leading data forecaster, IDC (International Data Corporation), now forecasts that the datasphere will surge to 2,140 ZB by 2035 (Figure 1).
Figure 1. IDC’s annual global datasphere forecasts 2010-2035
But we believe that even this forecast significantly underestimates the amount of data the world will produce. Data is set to undergo a revolutionary change over the next decade with the explosive growth of IoT (internet-of-things) and autonomous transportation.
IoT, which allows electronic devices to communicate with each other and be controlled from a single platform, is creating a massive surge in data. Minute IoT devices are being placed on everything from fridges and cars, right through to critical infrastructure like pipelines and bridges, sending constant performance updates that can alert engineers when problems first begin to occur.
Seven times larger
But the biggest, and least appreciated, additional pressure on data storage capacity comes with the shift in the data source towards devices and EVs, which are expected to produce over 90% of all data by the 2030s.
Autonomous transportation is likely to reach Level 4 (where the vehicle can handle all aspects of driving under most conditions) by the mid-2020s and should be widely available across most EV models by 2030.
Our recent Tesla Report forecast that Tesla’s autonomous EV fleet will reach 110 million vehicles by 2035. If we extrapolate this across the entire global EV fleet, we reach a total of 400 million vehicles. Each autonomous EV is expected to generate up to 40 GB (gigabytes) of data per second, according to Intel’s ex-CEO Andy Grove.
With a fleet of 400 million vehicles in 2035, Holon estimates autonomous EVs will produce between 10,000 and 15,000 ZB of data a year. Add in the data from IoT devices and all other data sources, and our datasphere estimates soar above 15,000 ZB in 2035 (Figure 2). This is almost seven times larger than IDC’s forecast.
Figure 2. Holon datasphere forecasts 2020-2035
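The article does not state the utilization assumption behind the 10,000 to 15,000 ZB range. As a rough sketch only, the Python below multiplies the quoted fleet size and the quoted 40 GB per second by an assumed daily driving time. The function name `fleet_data_zb` and the 30 and 40 minute inputs are hypothetical choices made for illustration, not Holon's model.

```python
# Back-of-envelope fleet data estimate.
# Assumption (not from the article): each vehicle records only while driving,
# for a fixed number of minutes per day.
GB_PER_ZB = 1e12  # 1 ZB = 10^21 bytes = 10^12 GB (decimal units)


def fleet_data_zb(vehicles, gb_per_second, driving_minutes_per_day, days_per_year=365):
    """Annual data generated by a vehicle fleet, in zettabytes."""
    seconds_per_year = driving_minutes_per_day * 60 * days_per_year
    return vehicles * gb_per_second * seconds_per_year / GB_PER_ZB


# 400 million vehicles and 40 GB/s are the figures quoted in the article;
# the driving times are illustrative inputs.
for minutes in (30, 40):
    zb = fleet_data_zb(400e6, 40, minutes)
    print(f"{minutes} min/day -> {zb:,.0f} ZB per year")
```

Under these inputs the result lands inside the quoted range, which suggests the estimate is highly sensitive to how long each vehicle is assumed to be recording.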
How do we store it?
This deluge of data will create a huge new problem: storage. How do we sort the wheat from the chaff and save the most valuable data while throwing away the rest?
At the end of 2020, global data center storage capacity sat at just 2.1 ZB. Without a substantial increase in global storage capacity and infrastructure, storing even a fraction of our annual datasphere is becoming more difficult.
Holon believes this issue is substantially underappreciated by investors, with the problem so large that it will change the foundations for how the internet is managed today.
IDC estimates that global data storage capacity will double to 13.2 ZB in 2024, up 18% per annum from 2020. If we assume that end-user storage capacity (data stored in the likes of mobile phones) continues to grow by 10% per year (to reach 6.9 ZB in 2024), that leaves 6.3 ZB, almost 50% of total capacity, in data centers across the world.
To achieve this target, data center storage must grow by 32% per year from IDC’s 2020 data center capacity estimate of 2.1 ZB, well above the 18% growth rate for total capacity.
If we extend these same growth assumptions out to 2035, we will see global data storage capacity reach almost 82 ZB, almost 12 times larger than today’s footprint.
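The growth figures above can be checked with a few lines of Python. This is a minimal sketch using only the numbers quoted in the text. The 2020 total capacity is not stated in the article, so it is backed out from the 2024 figure at 18% a year, which is an assumption. The helper names `cagr` and `project` are illustrative.

```python
def cagr(start, end, years):
    """Compound annual growth rate that takes `start` to `end` over `years`."""
    return (end / start) ** (1 / years) - 1


def project(start, rate, years):
    """Value after compounding `start` at `rate` for `years`."""
    return start * (1 + rate) ** years


total_2024 = 13.2     # ZB, IDC total capacity estimate quoted above
end_user_2024 = 6.9   # ZB, end-user devices growing at 10% a year
dc_2020 = 2.1         # ZB, IDC data center capacity estimate for 2020

dc_2024 = total_2024 - end_user_2024   # residual left for data centers
dc_share = dc_2024 / total_2024
dc_growth = cagr(dc_2020, dc_2024, 4)

# Assumption: the 2020 total is the 2024 total discounted at 18% a year.
total_2020 = total_2024 / 1.18 ** 4
total_2035 = project(total_2020, 0.18, 15)

print(f"data center capacity in 2024: {dc_2024:.1f} ZB ({dc_share:.0%} of total)")
print(f"required data center growth: {dc_growth:.1%} per year")
print(f"total capacity: {total_2020:.1f} ZB in 2020, {total_2035:.1f} ZB in 2035")
print(f"multiple of the 2020 total: {total_2035 / total_2020:.1f}x")
```

On these inputs the data center growth rate comes out close to 32%, and the 2035 total is roughly 12 times the implied 2020 total.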
But that is not enough.
Data storage capacity plummets
In Figure 3, we’ve taken each year’s estimate of total data storage capacity (from 2020 to 2035) and divided it by the cumulative datasphere since 2010, based on IDC’s estimates, up to that year. This allows us to determine how much of our data we have the storage capacity to retain.
Figure 3. Global data storage capability since 2020
The results are startling. Even after assuming an almost 12-fold increase in our global data storage capacity by 2035 to 82 ZB, our ability to save our data will fall dramatically.
In 2020, we had the capacity to save 2.9% of all data we have created since 2010 (referred to in this article as our ‘total storage capability’). The introduction of both IoT and autonomous EVs will drive a massive increase in data volumes, lowering our total storage capability to 1.2% in 2030 and just 0.8% in 2035.
Making decisions about which data to keep and which to discard is therefore critical. Smart contracts, which enable near zero-cost micro-payments through automatic digital payments and access rules, will allow individual data files that were previously uneconomic to manage to earn income. This could alter the equation and make more types of data valuable. With a 72% loss of total data storage capability expected under IDC’s forecasts, storage providers must substantially increase their investment in data storage capacity to prevent any further loss of valuable data.
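The ‘total storage capability’ measure is capacity in a given year divided by all data created since the start of the series. A minimal Python sketch of that ratio follows. The three-year series is made up purely to show the mechanics and is not IDC or Holon data. Only the 2.9% and 0.8% endpoints come from the article.

```python
from itertools import accumulate


def storage_capability(capacity_zb, annual_datasphere_zb):
    """Capacity in each year divided by all data created up to that year."""
    cumulative = accumulate(annual_datasphere_zb)
    return [cap / total for cap, total in zip(capacity_zb, cumulative)]


def relative_loss(start, end):
    """Fractional decline from `start` to `end`."""
    return 1 - end / start


# Hypothetical three-year series in ZB, for illustration only.
capacity = [2.0, 2.4, 2.8]
datasphere = [50.0, 60.0, 70.0]
shares = storage_capability(capacity, datasphere)
print([f"{share:.1%}" for share in shares])

# Endpoints quoted in the article: 2.9% in 2020 falling to 0.8% in 2035.
print(f"loss of storage capability: {relative_loss(0.029, 0.008):.0%}")
```

Because the denominator accumulates every year while capacity only grows at a fixed rate, the ratio falls even when capacity is rising, which is the effect the 72% figure summarizes.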
Holon’s datasphere scenario
An alternate scenario sees autonomous EVs and IoT delivering substantially higher data production levels than IDC’s estimate. Holon believes that the global datasphere will comfortably surpass 15,000 ZB by 2035 and will require a massive increase in global data storage capacity of over 26% per year, to 230 ZB by 2035 and 1,000 ZB by 2040.
Running the same exercise as above to determine our data storage capabilities at these substantially higher (and in Holon’s view most likely) data production levels also provides startling results. Despite growing our data storage in this example by over 30-fold in the next 15 years to 230 ZB, and to 1,000 ZB by 2040, our storage capabilities rapidly fall to below just 0.5% of our total datasphere produced since 2010 (Figure 4).
Figure 4. Global data storage capability since 2010
This leaves us with an important question that few investors have ever considered: how do we ensure we have adequate data storage capacity to keep important data over the very long term?
Holon believes that the world is massively underestimating the volume of data we create as well as our storage requirements over the next few decades from IoT and autonomous EVs. As outlined in the example above, even after expanding our global data storage capacity by 30-fold in just 15 years, we will still need to discard 99.6% of our total datasphere since 2010.
Holon’s long-term estimate of 1,000 ZB of global storage capacity by 2040 (a roughly 150-fold increase over our capacity today) also looks highly conservative. For example, if the world required 1% of total data production to be stored, we would need storage of 500 ZB by 2035. This would require annual storage capacity growth of 33% for the next 14 years, 7 percentage points higher than the growth rate required to hit 228 ZB in 2035.
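The gap between the two growth rates can be reproduced approximately as follows. This is a sketch under stated assumptions: the article does not give the starting capacity or exact horizon behind its 33% figure, so the code assumes a 2020 base of 6.8 ZB (13.2 ZB in 2024 discounted at 18% a year) and a 15-year horizon. The name `required_growth` is illustrative.

```python
def required_growth(start_zb, target_zb, years):
    """Annual growth rate needed to move from `start_zb` to `target_zb`."""
    return (target_zb / start_zb) ** (1 / years) - 1


BASE_2020_ZB = 6.8   # assumption: implied 2020 total capacity
YEARS = 15           # assumption: 2020 to 2035

rates = {}
for target in (228, 500):
    rates[target] = required_growth(BASE_2020_ZB, target, YEARS)
    print(f"{target} ZB by 2035 needs about {rates[target]:.1%} growth per year")

gap_points = (rates[500] - rates[228]) * 100
print(f"gap: about {gap_points:.0f} percentage points")
```

A different base year or horizon shifts both rates, but the gap between the two targets stays close to seven percentage points.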
We need to find an alternative investment model for data storage capacity
Considering that it has taken over US$2 trillion to build just 2 ZB of data center capacity, it should quickly become apparent that the largest global data providers today, which include AWS (Amazon), Azure (Microsoft), Google and Alibaba, can no longer afford to fund the cost of building the volume of data storage capacity outlined. Holon expects it will cost over US$100 trillion to reach 1,000 ZB by 2040.
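The two cost figures imply very different unit costs, which a short calculation makes explicit. This is illustrative arithmetic on the quoted numbers only. Both are stated as ‘over’ a threshold, so the results should be read as rough lower bounds rather than Holon’s cost model.

```python
def cost_per_zb(total_cost_trillion_usd, capacity_zb):
    """Average build cost in US$ trillion per zettabyte of capacity."""
    return total_cost_trillion_usd / capacity_zb


historical = cost_per_zb(2, 2)        # US$2 trillion for 2 ZB built to date
projected = cost_per_zb(100, 1000)    # US$100 trillion for 1,000 ZB by 2040

print(f"historical unit cost: US${historical:.2f} trillion per ZB")
print(f"unit cost implied by the 2040 estimate: US${projected:.2f} trillion per ZB")
print(f"1,000 ZB at the historical unit cost: US${historical * 1000:,.0f} trillion")
```

In other words, the US$100 trillion estimate already assumes that the cost per ZB falls to about a tenth of its historical level.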
Is this good or bad for the big cloud players as investments?
An alternate approach to funding the cost of global storage infrastructure is a Web 3.0 project called ‘Filecoin’, which dramatically lowers the cost of data storage and provides a reward protocol for storage providers that bring capacity and customers to the Filecoin storage platform.
Depending on the market price of Filecoin tokens, the rewards could provide additional capital to fund the cost of storage infrastructure. Look out for more on this subject in our next white paper on data which will be released in Q1 2022.