Citation:
Citation_Information:
Originator: Maria C. Figueroa Matias
Publication_Date: 20251215
Title:
Machine Learning Model: Estimates of Metal Abundance in Global Seafloor Massive Sulfide Deposits
Geospatial_Data_Presentation_Form: Model
Series_Information:
Series_Name: Data Release
Issue_Identification: DOI:10.5066/P13PYBJL
Publication_Information:
Publication_Place:
Pacific Coastal and Marine Science Center, Santa Cruz, California
Publisher: U.S. Geological Survey
Other_Citation_Details:
Suggested Citation: Figueroa Matias, M.C., 2025, Machine Learning Model: Estimates of Metal Abundance in Global Seafloor Massive Sulfide Deposits: U.S. Geological Survey data release, https://doi.org/10.5066/P13PYBJL.
Online_Linkage: https://doi.org/10.5066/P13PYBJL
Description:
Abstract:
A multi-stage ensemble machine learning model was developed to estimate metal abundances in seafloor massive sulfide deposits worldwide. The modeling framework integrates (1) KMeans++ clustering to identify geochemical groupings based on enrichment controls, (2) Random Forest classification to assign geochemical labels to vent fields with incomplete or absent geochemical data, and (3) XGBoost regression to generate high-fidelity predictions of metal concentrations. This USGS model application data release includes all scripts, input files, and output files necessary to apply the model to estimate concentrations of cobalt, gold, and zinc. The model is not limited by spatial boundaries and is intended for application to any oceanic location with appropriate input data.
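For readers unfamiliar with stage (1), the following is a minimal, standard-library Python sketch of KMeans++ clustering: k-means++ seeding followed by ordinary Lloyd iterations. It is not taken from the scripts in this release. The function names (sq_dist, kmeans_pp_init, kmeans), the two-feature toy samples, and the choice of k=2 are all hypothetical and serve only to illustrate the technique. The release's own scripts remain the authority on how clustering was actually configured.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, rng):
    """k-means++ seeding: pick the first center uniformly at random, then
    draw each further center with probability proportional to its squared
    distance from the nearest center chosen so far."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        weights = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [
        min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
        for p in points
    ]


def kmeans(points, k, n_iter=20, seed=0):
    """Lloyd's algorithm started from k-means++ seeds (illustrative only)."""
    rng = random.Random(seed)
    centers = kmeans_pp_init(points, k, rng)
    for _ in range(n_iter):
        labels = assign(points, centers)
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's center where it is
                centers[j] = tuple(sum(col) / len(members) for col in zip(*members))
    return assign(points, centers), centers


# Hypothetical toy samples with two made-up features each; real inputs
# would be the geochemical variables shipped with the release.
points = [(0.1, 0.2), (0.0, 0.1), (0.2, 0.0), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
labels, centers = kmeans(points, k=2)
print(labels)
```

In the framework described above, cluster labels of this kind would then serve as the targets for the Random Forest classifier in stage (2). That step and the XGBoost regression in stage (3) are not sketched here.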
Purpose:
The purpose of this data release is to provide a machine learning framework and supporting files developed to estimate cobalt, gold, and zinc concentrations in seafloor massive sulfide (SMS) deposits. The model supports efforts to better understand geochemical variability and metal enrichment in SMS systems and to improve deep-sea mineral resource assessments across diverse tectonic settings.
Supplemental_Information:
See SMS-MetalML_reference-list.pdf for details on all external sources used in this work. See the README.md files for additional information on the operating system and software versions used to develop this model, the directory structure, and files not listed here in the metadata. Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government.
Time_Period_of_Content:
Time_Period_Information:
Range_of_Dates/Times:
Beginning_Date: 19761122
Ending_Date: 20240621
Currentness_Reference: publication date of data used to train the model
Status:
Progress: Complete
Maintenance_and_Update_Frequency: None Planned
Spatial_Domain:
Bounding_Coordinates:
West_Bounding_Coordinate: -180
East_Bounding_Coordinate: 180
North_Bounding_Coordinate: 90
South_Bounding_Coordinate: -90
Keywords:
Theme:
Theme_Keyword_Thesaurus: USGS Thesaurus
Theme_Keyword: modeling
Theme_Keyword: critical minerals
Theme_Keyword: mineral resources
Theme_Keyword: sea-floor characteristics
Theme_Keyword: marine chemistry
Theme:
Theme_Keyword_Thesaurus: ISO 19115 Topic Category thesaurus
Theme_Keyword: geoscientificInformation
Theme_Keyword: oceans
Theme_Keyword: economy
Theme:
Theme_Keyword_Thesaurus: None
Theme_Keyword: U.S. Geological Survey
Theme_Keyword: USGS
Theme_Keyword: Coastal and Marine Hazards and Resources Program
Theme_Keyword: CMHRP
Theme_Keyword: Pacific Coastal and Marine Science Center
Theme_Keyword: PCMSC
Theme_Keyword: machine learning
Theme:
Theme_Keyword_Thesaurus: USGS Metadata Identifier
Theme_Keyword: USGS:67f010a4d4be02766d636810
Access_Constraints:
No access constraints. Acknowledgment of the U.S. Geological Survey would be appreciated in products derived from this model application release.
Use_Constraints:
USGS-authored or produced data and information are in the public domain from the U.S. Government and are freely redistributable with proper metadata and source attribution. These data are licensed under CC BY 4.0 and users must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use. Please recognize and acknowledge the U.S. Geological Survey as the originator(s) of the dataset and in products derived from these data. Although the information contained in the model files may be useful for other purposes, it is incumbent on the user to understand the purpose, construction, and limitations of this model. This information is not intended for navigation purposes.
Point_of_Contact:
Contact_Information:
Contact_Organization_Primary:
Contact_Organization:
U.S. Geological Survey, Pacific Coastal and Marine Science Center
Contact_Person: PCMSC Science Data Coordinator
Contact_Address:
Address_Type: mailing and physical
Address: 2885 Mission Street
City: Santa Cruz
State_or_Province: CA
Postal_Code: 95060
Contact_Voice_Telephone: 831-427-4747
Contact_Electronic_Mail_Address: pcmsc_data@usgs.gov
Native_Data_Set_Environment: Windows 11 Enterprise Version 23H2, Python 3.11.4.