Metadata Management

Over the past four decades, the Yang Center has acquired hundreds of years of recordings of the natural world, because we usually deploy multiple recorders running 24/7 for months at a time. Keeping track of hundreds of projects and deployments, along with that volume of sound data, is not an easy task. Properly archived and accessible sound data supports current analyses and can be combined for long-term meta-analyses.

We manage all our projects, data archives, and equipment with a FileMaker relational database called “CCB Metadata”, which was developed in-house. The database coordinates with a commercial file cataloging program (abeMeda). The information it tracks includes:

  • General project info (sponsor information, principal investigators, etc.)
  • Deployment information (crew, locations, etc.)
  • Recorder details (gain settings, media IDs, start and end times, issue tracking, etc.)
  • Locations and detailed contents of archival data on hard drives, servers, and digital tape
  • Equipment (computers, recording gear, etc.)
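The relational structure described above can be illustrated with a small sketch. This is not the CCB Metadata schema, which is built in FileMaker and is not shown here. It uses Python's built-in sqlite3 module instead, and every table name, column name, and demo row below is a hypothetical placeholder chosen only to show how projects, deployments, and recorders might link together.

```python
import sqlite3

# Hypothetical tables: one project has many deployments,
# and one deployment has many recorders.
SCHEMA = """
CREATE TABLE project (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    sponsor TEXT
);
CREATE TABLE deployment (
    id         INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES project(id),
    location   TEXT
);
CREATE TABLE recorder (
    id            INTEGER PRIMARY KEY,
    deployment_id INTEGER NOT NULL REFERENCES deployment(id),
    gain_db       REAL,
    media_id      TEXT,
    start_time    TEXT,
    end_time      TEXT
);
"""


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database holding placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    # Illustrative values only, not real Yang Center records.
    conn.execute("INSERT INTO project VALUES (1, 'Demo project', 'Demo sponsor')")
    conn.execute("INSERT INTO deployment VALUES (1, 1, 'Demo site')")
    conn.executemany(
        "INSERT INTO recorder VALUES (?, 1, ?, ?, ?, ?)",
        [
            (1, 24.0, "M001", "2023-01-01T00:00", "2023-03-01T00:00"),
            (2, 24.0, "M002", "2023-01-01T00:00", "2023-03-01T00:00"),
        ],
    )
    conn.commit()
    return conn


def recorders_for_project(conn: sqlite3.Connection, project_id: int) -> list[str]:
    """Return the media IDs of every recorder deployed under one project."""
    rows = conn.execute(
        """
        SELECT r.media_id
        FROM recorder AS r
        JOIN deployment AS d ON r.deployment_id = d.id
        WHERE d.project_id = ?
        ORDER BY r.id
        """,
        (project_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Linking the tables by ID means a recorder's details are stored once and can be reached from either its deployment or its parent project with a join.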

Each deployment record summarizes the details for every recorder, and each recording site is automatically plotted on a Google map embedded in the database. The adjacent image shows a typical deployment record.

The database is hosted on a server in the Yang Center. Staff members using Windows PCs, Macs, and iPads can all access the database to add, update, or search for the information they need.

As of April 2023, the FileMaker database holds:

  • 577 unique project entries (many projects contain multiple deployments)
  • 522 deployment records (usually involving multiple recorders)
  • 4,525 individual recording units (terrestrial and marine)
  • 7,500 records of archival objects (locations of discrete data sets on hard drive, tape, cloud, or NAS)
  • 3,300 hard drives and tapes
  • 1,200 equipment items (computers, monitors, etc.)

The cataloging program holds 2,200 catalogs documenting 2.7 petabytes of data in 165 million files. Each sound file is scanned when cataloged to capture details such as duration, sample rate, and channel count. The software can then search these details to find files that meet precise criteria.
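As a rough illustration of the kind of scan described above, the sketch below reads the header of a WAV file with Python's standard wave module and records its duration, sample rate, and channel count. It is not how abeMeda works internally. The names SoundFileInfo, scan_wav, and find_files are hypothetical, and the sketch assumes uncompressed PCM WAV files only.

```python
import wave
from dataclasses import dataclass


@dataclass
class SoundFileInfo:
    """Hypothetical catalog entry for one sound file."""
    path: str
    duration_s: float
    sample_rate_hz: int
    channels: int


def scan_wav(path: str) -> SoundFileInfo:
    """Read basic properties from a PCM WAV file header."""
    with wave.open(path, "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return SoundFileInfo(
            path=path,
            duration_s=frames / rate,  # frames divided by frames per second
            sample_rate_hz=rate,
            channels=wf.getnchannels(),
        )


def find_files(entries, min_duration_s=0.0, sample_rate_hz=None, channels=None):
    """Return the entries that match every criterion that was given."""
    return [
        e for e in entries
        if e.duration_s >= min_duration_s
        and (sample_rate_hz is None or e.sample_rate_hz == sample_rate_hz)
        and (channels is None or e.channels == channels)
    ]
```

Once each file has been scanned, a search such as find_files(entries, min_duration_s=3600, channels=2) only has to filter the stored entries and does not need to reopen the audio.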