Holon Global Investments - Active Fund Manager

Preserving History: USC’s Innovative Approach to Archival Data with Filecoin Decentralised Storage

Published 21 Nov 2023

In a recent spotlight at FIL Vegas, a Filecoin community event, Sam Gustman, Chief Technology Officer at the University of Southern California (USC) Libraries and the USC Shoah Foundation, showcased the pivotal role of Filecoin decentralised storage in preserving historical data integrity. FIL Vegas attendees, including Holon’s Managing Director Heath Behncke and Director of Technology & Innovation Jonathon Hooker, witnessed how Filecoin decentralised storage adds innovative value to archival data preservation.

Gustman, with over 27 years of experience at USC, manages a vast storage capacity of over 60 petabytes, including 8 petabytes dedicated to video content. His pioneering approach has the potential to revolutionize how universities worldwide store valuable archival data.

The USC Shoah Foundation, initiated by Steven Spielberg after “Schindler’s List”, focuses on preserving testimonials related to the Holocaust. Housing 25 petabytes of audiovisual interviews in 45 languages from 65 countries, the Foundation also documents events beyond the Holocaust, including the Armenian Genocide, the Nanjing Massacre, the Cambodian Genocide and, most recently, the Hamas terror attack in Israel. The scale of this collection makes USC’s data preservation a colossal challenge.

Source: The Shoah Foundation

The core challenge for USC lies in ensuring the enduring preservation of this valuable data. Gustman emphasized the critical distinction between digitisation and preservation, highlighting the various threats that can compromise digitised content. Effectively safeguarding this collection demands a comprehensive data management plan.

As Sam Gustman points out, “The unfortunate reality is that everything decays over time. Conservative estimates indicate a lifespan of 50 years for film, 20 years for videotape, 5 years for hard drives, and 3 years for Linear Tape-Open (LTO)”, a magnetic tape storage format. “Interestingly, the newer the technology, the faster the media deterioration. Detecting and guarding against this decay necessitates creating multiple copies stored in diverse locations and on different technological platforms.”

Source: filecoin.io
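
The multiple-copies strategy Gustman describes is commonly paired with a periodic “fixity check”, in which a checksum of each copy is compared against the others. The sketch below is illustrative only and is not USC’s actual tooling: the copy names, the sample bytes and the majority-vote rule are all assumptions, and it presumes that most copies are still intact.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one copy's bytes."""
    return hashlib.sha256(data).hexdigest()


def find_decayed_copies(copies: dict[str, bytes]) -> list[str]:
    """Return the names of copies whose digest differs from the majority digest.

    Assumes at least one copy exists and that most copies are still intact.
    """
    digests = {name: digest(data) for name, data in copies.items()}
    majority_digest, _ = Counter(digests.values()).most_common(1)[0]
    return sorted(name for name, d in digests.items() if d != majority_digest)


# Hypothetical copies of one file held on three different platforms.
original = b"testimony-frame-data"
copies = {
    "tape_library": original,
    "cloud_bucket": original,
    "hard_drive": b"testimony-frame-dat\x00",  # simulated bit rot
}

print(find_decayed_copies(copies))  # ['hard_drive']
```

A copy flagged this way could then be restored from one of the intact copies, which is why keeping copies on different media and in different locations matters.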

There are other issues, such as entity risk. Gustman cites “organizational problems, an Enron-type event, where we watched the head of Amazon fighting with the president a number of years back, who knows where that can lead if you have your stuff stored in the cloud, so you can have organizational collapse.”

Reading and writing have spanned 5,000 years, with enduring works like the Bible, the Quran, the Torah, and Shakespeare standing the test of time for centuries and becoming what is known as a ‘super text’. However, digital files and moving images have only been part of our lives for a mere 140 years. The question arises: how can we elevate these newer forms of content to the status of a ‘super text’?

From the initial capture to long-term preservation, ensuring access, and meaningful use, the challenge lies in safeguarding the integrity of the original asset. In the realm of artificial intelligence, such as the use of Neural Radiance Fields (NeRFs) to transform content into 3D, a methodology employed by the Foundation and shown below, preserving the authenticity of the original becomes paramount. How can we validate that the content hasn’t undergone unauthorized alterations? How do we guarantee and demonstrate that the original images used to construct a 3D environment are indeed genuine and accurately represent their origin?

Source: USC Shoah Foundation

Gustman’s approach to address these challenges is to proliferate content across diverse platforms and technologies, engaging a spectrum of organisations from non-profit to for-profit and governmental. The key, it seems, is ubiquity – placing the content in as many contexts as possible to fortify its standing as a ‘super text.’

USC’s data storage management plan has evolved over time to combine different kinds of preservation storage. The foundation layer is its extensive tape libraries, dispersed globally and kept synchronised. Tapes are recognized for their robustness and immunity to ransomware, albeit with a trade-off of slower data retrieval speeds. To augment these capabilities, the institution then added cloud storage by integrating Azure, capitalizing on its efficiency in rapidly copying petabytes of data worldwide, a task conventionally challenging with tape systems.

The most recent addition to the ongoing evolution of USC’s storage strategy is to introduce Filecoin decentralised storage on Web 3.0 into their data storage stack. This integration facilitates the implementation of content addressable storage, ensuring a seamless flow from the camera, to preservation, to access. This comprehensive approach guarantees the authentication and integrity of the original asset throughout its lifecycle. Whether in legal proceedings or for other critical purposes, this strategy instils confidence in the preservation of the collection’s integrity, offering a verifiable assurance of its unaltered state.

Today, as USC collects new content, it also records all facets of that content, including its origin, collection timestamp, and recipients. This comprehensive data set is then transformed into a verifiable asset within Filecoin decentralised storage, ensuring its integrity can always be proven.

Filecoin decentralised storage is a blockchain-based storage system. It achieves this via a cryptographic hash function, which derives an identifier known as the “content address” from each piece of content. Any attempt to manipulate the file results in a different identifier, which is what underpins the integrity of content-addressable systems.
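
To illustrate the idea, the sketch below derives an identifier from a file’s bytes using SHA-256 from the Python standard library. It is a simplified stand-in rather than Filecoin’s actual format: real content identifiers on Filecoin encode additional information, and the `content_address` helper, the `sha256-` prefix and the sample bytes are assumptions made for this example.

```python
import hashlib


def content_address(data: bytes) -> str:
    """Derive an identifier from the bytes themselves.

    Simplified illustration only; this is not a real Filecoin content identifier.
    """
    return "sha256-" + hashlib.sha256(data).hexdigest()


original = b"interview recording, take 1"
tampered = b"interview recording, take 2"

original_address = content_address(original)
tampered_address = content_address(tampered)

# The same bytes always give the same address; any change gives a different one.
print(original_address == content_address(original))  # True
print(original_address == tampered_address)           # False
```

Because the address is computed from the content rather than assigned to it, anyone holding the file can recompute the address and compare it with the recorded one.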

By leveraging these strategies and utilising new technologies like Filecoin decentralised storage, the team at USC guarantees that content is securely stored with integrity, making it accessible globally and preserving it for future generations.

Gustman went on to discuss data management and AI: “proving that the background data was the original asset is going to be a big part of what happens with AI and academic collections.” So, while AI delivers outputs, its value lies in understanding the provenance of the original datasets. This necessitates concrete, unaltered content address identifiers, coupled with comprehensive metadata enveloping these assets. Such meticulous documentation is essential to substantiate that the generated output is rooted in genuine, unaltered sources.
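
One way to picture the pairing of a content identifier with its metadata is a small provenance record that seals both together. The sketch below is a hypothetical illustration, not the Foundation’s schema: the field names, the metadata values and the use of canonical JSON hashed with SHA-256 are all assumptions.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def seal(body: dict) -> str:
    """Hash a canonical (sorted-key, compact) JSON encoding of the record body."""
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return sha256_hex(canonical.encode("utf-8"))


def make_record(asset: bytes, metadata: dict) -> dict:
    """Bind metadata to an asset by storing the asset digest next to it and sealing both."""
    body = {"asset_digest": sha256_hex(asset), "metadata": metadata}
    return {**body, "record_digest": seal(body)}


def verify_record(asset: bytes, record: dict) -> bool:
    """Check that neither the asset nor its recorded metadata has changed."""
    body = {"asset_digest": record["asset_digest"], "metadata": record["metadata"]}
    return (
        sha256_hex(asset) == record["asset_digest"]
        and seal(body) == record["record_digest"]
    )


# Hypothetical asset and metadata values, for illustration only.
asset = b"original source image bytes"
metadata = {
    "origin": "camera-01",
    "collected_at": "2023-11-01T10:00:00Z",
    "recipients": ["archive"],
}
record = make_record(asset, metadata)

print(verify_record(asset, record))               # True
print(verify_record(asset + b"edited", record))   # False
```

A dataset used for AI could carry one such record per source file, so that a later audit can confirm each input still matches what was originally captured.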

USC has recently established the first academic Filecoin decentralised data node within a university setting. Gustman emphasized the university’s commitment to integrating academic content and developing the essential infrastructure, policies, and procedures. The initiative aims to enable academics and academic collections to seamlessly utilise Web 3.0 data storage across diverse university environments. USC is thereby laying the groundwork for academic institutions globally to leverage Filecoin decentralised storage for their benefit.

In conclusion, the integration of Filecoin’s decentralised storage into USC’s archival data preservation efforts marks a significant leap forward in the quest to safeguard historical integrity in the digital age. Sam Gustman’s pioneering approach not only addresses the challenges posed by the decay of various storage media but also confronts organisational risks and the complexities associated with preserving diverse and expansive collections. With the establishment of a Filecoin decentralised storage node, USC sets the stage for global academic institutions to harness the value of decentralised storage on the Filecoin Network.

 

Disclaimer: This Article has been prepared by Holon Global Investments Limited ABN 60 129 237 592. Holon Global Innovations Pty Ltd (“HGI”) is a wholly owned subsidiary of Holon Global Investments Limited (together “Holon”). HGI is a Filecoin (FIL) Storage Provider and is positioned as a major player in the FIL decentralised data storage arena. FIL Storage Providers are rewarded in FIL for the provision of data storage capacity. Holon, its officers, employees and agents believe that the information in this material and the sources on which the information is based (which may be sourced from third parties) are correct as at the date of publication. While every care has been taken in the preparation of this material, no warranty of accuracy or reliability is given and no responsibility for this information is accepted by Holon, its officers, employees or agents. Except where contrary to law, Holon excludes all liability for this information.
