Metadata Management

Over the past four decades, the Yang Center has acquired the equivalent of nearly a thousand years of recordings of the natural world, because we usually deploy multiple recorders running 24/7 for months at a time. Keeping track of hundreds of projects and deployments, along with that volume of sound data, is not an easy task. Properly archived and accessible sound data both serves current analyses and can be combined for long-term meta-analyses.
We manage all our projects, data archives, and equipment with “CCB Metadata”, a FileMaker relational database developed in-house. It coordinates with a commercial file cataloging program (abeMeda). The database tracks:
- General project info (sponsor information, principal investigators, etc.)
- Deployment information (crew, locations, etc.)
- Recorder details (gain settings, media IDs, start and end times, issue tracking, etc.)
- Locations and detailed contents of archival data on hard drives, servers, and digital tape
- Equipment (computers, recording gear, etc.)
Each deployment record summarizes the details for each recorder, and each recording site is automatically plotted on a Google map embedded in the database. The adjacent image shows a typical deployment record.
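As a rough illustration of how a relational layout like this fits together, the sketch below models projects, deployments, and recorders in SQLite from Python. The table names, column names, and sample values are all hypothetical and chosen only for illustration; they do not reflect the actual CCB Metadata schema, which lives in FileMaker.

```python
import sqlite3

# Hypothetical schema: names and columns are illustrative assumptions,
# not the real layout of the FileMaker database.
SCHEMA = """
CREATE TABLE project (
    id      INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    sponsor TEXT
);
CREATE TABLE deployment (
    id         INTEGER PRIMARY KEY,
    project_id INTEGER NOT NULL REFERENCES project(id),
    location   TEXT
);
CREATE TABLE recorder (
    id            INTEGER PRIMARY KEY,
    deployment_id INTEGER NOT NULL REFERENCES deployment(id),
    gain_db       REAL,
    media_id      TEXT,
    start_time    TEXT,
    end_time      TEXT,
    latitude      REAL,
    longitude     REAL
);
"""


def build_demo_db():
    """Create an in-memory database holding made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO project VALUES (1, 'Demo project', 'Demo sponsor')")
    conn.execute("INSERT INTO deployment VALUES (1, 1, 'Demo site')")
    # Two invented recorders belonging to the same deployment.
    conn.executemany(
        "INSERT INTO recorder VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 1, 24.0, "CARD-001", "2024-01-01T00:00:00", "2024-03-01T00:00:00", 42.48, -76.45),
            (2, 1, 24.0, "CARD-002", "2024-01-01T00:00:00", "2024-03-01T00:00:00", 42.49, -76.44),
        ],
    )
    return conn


def deployment_summary(conn, deployment_id):
    """Return one row per recorder in a deployment, ordered by recorder id."""
    return conn.execute(
        "SELECT id, media_id, gain_db, latitude, longitude "
        "FROM recorder WHERE deployment_id = ? ORDER BY id",
        (deployment_id,),
    ).fetchall()
```

Calling `deployment_summary(build_demo_db(), 1)` returns one tuple per recorder, including the coordinates that a map view could plot.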
The database is hosted on a server in the Yang Center. Staff members on Windows PCs, Macs, and iPads can all access the database to add, update, or search for the information they need.
As of October 2024, the FileMaker database holds:
- 602 unique project entries (many projects contain multiple deployments)
- 564 deployment records (usually involving multiple recorders)
- 5,256 individual recording units (terrestrial and marine)
- 8,700 records of archival objects (locations of discrete data sets on hard drive, tape, cloud, or NAS)
- 3,700 hard drives and tapes
- 1,538 equipment items (computers, monitors, etc.)
The cataloging program holds 2,416 catalogs documenting 4.4 petabytes of data in 185 million files. Each sound file is scanned when cataloged to capture details such as duration, sample rate, and channel count, so staff can then search the catalogs for files that meet precise criteria.
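The scan-then-search idea can be sketched in a few lines of Python. This is not how abeMeda works internally; it is a minimal illustration that uses the standard-library `wave` module, which reads PCM WAV files only. The `SoundFileInfo` record and the `scan_wav` and `find_files` helpers are hypothetical names introduced for this example.

```python
import wave
from dataclasses import dataclass


@dataclass
class SoundFileInfo:
    """Hypothetical catalog entry holding the details named in the text."""
    path: str
    duration_s: float
    sample_rate_hz: int
    channels: int


def scan_wav(path):
    """Read header details from one PCM WAV file without loading the audio."""
    with wave.open(str(path), "rb") as wf:
        rate = wf.getframerate()
        return SoundFileInfo(
            path=str(path),
            duration_s=wf.getnframes() / rate,
            sample_rate_hz=rate,
            channels=wf.getnchannels(),
        )


def find_files(catalog, min_duration_s=0.0, sample_rate_hz=None, channels=None):
    """Return catalog entries matching every criterion that was supplied."""
    matches = []
    for info in catalog:
        if info.duration_s < min_duration_s:
            continue
        if sample_rate_hz is not None and info.sample_rate_hz != sample_rate_hz:
            continue
        if channels is not None and info.channels != channels:
            continue
        matches.append(info)
    return matches
```

Scanning once at catalog time means later searches touch only the small metadata records and never the sound files themselves, which matters when the archive spans petabytes.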