Digital Repositories at Scale that Invite Computation

Purpose

Our goal is to prove the performance of digital repository architectures under emerging next-generation workloads. We do this by building and testing various implementations of the Linked Data Platform (LDP), Memento, and Fedora 5 family of repository specifications. Testing workloads are based on systems requirements gathered in early 2018 from our institutional partners.

Studied Systems

Institutional Partners

Publications

Use Cases

  • Digital Manuscripts Collection
  • Big Scientific Dataset
  • Citizen Science Image Curation
  • Social Media Archives

Performance Workloads

  • Large file ingest
  • Deeply nested collection ingest and update
  • Very large folder ingest and update (AKA "Supernode")
  • Incremental scaling of storage nodes
  • Dynamic scaling of front-end nodes
  • Pervasive metadata and feature extraction (Brown Dog)
  • On-demand conversion to access formats (Brown Dog)
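
To illustrate how the "deeply nested" and "Supernode" ingest workloads differ in shape, here is a minimal sketch that plans LDP container-creation requests without sending them. It is not the project's test harness. The function names, the `PlannedRequest` type, the slug naming scheme, and the base URL in the usage note are all hypothetical. The sketch also assumes the server honors the `Slug` header when naming new containers, which LDP treats only as a hint.

```python
from typing import Dict, Iterator, NamedTuple

# Link header value that asks an LDP server to create a Basic Container.
LDP_BASIC_CONTAINER = '<http://www.w3.org/ns/ldp#BasicContainer>; rel="type"'


class PlannedRequest(NamedTuple):
    """One HTTP request in a workload plan (hypothetical helper type)."""
    method: str
    url: str
    headers: Dict[str, str]


def _create_container(parent_url: str, slug: str) -> PlannedRequest:
    # LDP creates a child resource by POSTing to the parent container.
    return PlannedRequest(
        "POST", parent_url, {"Slug": slug, "Link": LDP_BASIC_CONTAINER}
    )


def nested_plan(base_url: str, depth: int) -> Iterator[PlannedRequest]:
    """Deeply nested workload: each container is created inside the last one."""
    parent = base_url.rstrip("/")
    for level in range(depth):
        slug = f"level-{level}"
        yield _create_container(parent, slug)
        # Assumption: the server uses the Slug as the new path segment.
        parent = f"{parent}/{slug}"


def supernode_plan(base_url: str, children: int) -> Iterator[PlannedRequest]:
    """Supernode workload: every child is created under one parent container."""
    parent = base_url.rstrip("/")
    for index in range(children):
        yield _create_container(parent, f"child-{index}")
```

For example, `list(nested_plan("http://localhost:8080/rest", 3))` yields three POSTs whose target URLs grow one path segment at a time, while `list(supernode_plan("http://localhost:8080/rest", 3))` yields three POSTs that all target the same parent. A real run would send each planned request with an HTTP client and read the created resource's location from the response instead of assuming it.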

Test Results