<?xml version="1.0" encoding="UTF-8"?>
<metadata>
  <idinfo>
    <citation>
      <citeinfo>
        <origin>Daniel Buscombe</origin>
        <origin>Mark A. Lundine</origin>
        <origin>Sharon Batiste</origin>
        <origin>Catherine N. Janda</origin>
        <pubdate>20250325</pubdate>
        <title>Labeled satellite imagery for training machine learning models that predict the suitability of semantic segmentation model outputs for shoreline extraction.</title>
        <geoform>PNG</geoform>
        <serinfo>
          <sername>data release</sername>
          <issue>DOI:10.5066/P1N4VI7H</issue>
        </serinfo>
        <pubinfo>
          <pubplace>Pacific Coastal and Marine Science Center, Santa Cruz, California</pubplace>
          <publish>U.S. Geological Survey</publish>
        </pubinfo>
        <othercit>Suggested Citation: Buscombe, D., Lundine, M.A., Batiste, S., and Janda, C.N., 2025, Labeled satellite imagery for training machine learning models that predict the suitability of semantic segmentation model outputs for shoreline extraction, U.S. Geological Survey data release, https://doi.org/10.5066/P1N4VI7H.</othercit>
        <onlink>https://doi.org/10.5066/P1N4VI7H</onlink>
      </citeinfo>
    </citation>
    <descript>
      <abstract>This dataset contains semantic segmentations of Landsat, Sentinel, and PlanetScope satellite images of coastal shoreline regions. It consists of folders of images that have been labeled as either suitable or unsuitable for shoreline detection using existing conventional approaches such as CoastSat (Vos and others, 2019) or CoastSeg (Fitzpatrick and others, 2024). These data are intended only for use as a training and validation dataset for a machine learning model designed to determine whether the output of a deep-learning-based image segmentation model is suitable for estimating shoreline location. These data were used to train a Machine Learning model to recognize the quality of an image segmentation.</abstract>
      <purpose>These data support automated detection of coastal shoreline position and are provided for resource managers, science researchers, students, and the general public. These data can be opened with image viewing software and can be used within specialist software to train Machine Learning models to identify suitable and unsuitable segmentation output imagery for shoreline mapping. The imagery is organized into two folders: one for training and one for testing a Machine Learning model. These data were used to train a Machine Learning model to recognize the quality of an image segmentation.</purpose>
      <supplinf>This data release was funded by the USGS Coastal and Marine Hazards and Resources Program. Any use of trade, product, or firm names is for descriptive purposes only and does not imply endorsement by the U.S. Government. This data release contains modified PlanetScope imagery, provided under the NASA (National Aeronautics and Space Administration) CSDA (Commercial Satellite Data Acquisition) program under the standard Scientific Use License available at https://cdn.earthdata.nasa.gov/conduit/upload/14226/PlanetEULA042220.pdf, and the End User License Agreement available at https://earthdata.nasa.gov/s3fs-public/2022-02/Planet_Expanded_EULA_06-21.pdf. This license permits redistribution of imagery in significantly modified form. We provide only the visible (R, G, and B) bands of small sub-portions of downloaded tiles, in PNG format. As such, the original imagery (multispectral scenes in GeoTIFF format) has been cropped, stripped of its geospatial information, and re-encoded into 8-bit PNG format.</supplinf>
    </descript>
    <timeperd>
      <timeinfo>
        <rngdates>
          <begdate>1984</begdate>
          <enddate>2024</enddate>
        </rngdates>
      </timeinfo>
      <current>collection years of satellite imagery.</current>
    </timeperd>
    <status>
      <progress>Complete</progress>
      <update>None planned</update>
    </status>
    <spdom>
      <bounding>
        <westbc>-180.00000</westbc>
        <eastbc>180.00000</eastbc>
        <northbc>90.00000</northbc>
        <southbc>-90.00000</southbc>
      </bounding>
    </spdom>
    <keywords>
      <theme>
        <themekt>USGS Metadata Identifier</themekt>
        <themekey>USGS:2cef6b9f-2135-4f2c-8dbd-71c05f346ff1</themekey>
      </theme>
      <theme>
        <themekt>Global Change Master Directory</themekt>
        <themekey>Hazards Planning</themekey>
        <themekey>Ocean Waves</themekey>
        <themekey>Erosion</themekey>
        <themekey>Sea Level Rise</themekey>
        <themekey>Extreme Weather</themekey>
      </theme>
      <theme>
        <themekt>ISO 19115 Topic Category</themekt>
        <themekey>Oceans</themekey>
        <themekey>ClimatologyMeteorologyAtmosphere</themekey>
      </theme>
      <theme>
        <themekt>Data Categories for Marine Planning</themekt>
        <themekey>Physical Habitats and Geomorphology</themekey>
      </theme>
      <theme>
        <themekt>USGS Thesaurus</themekt>
        <themekey>Climate Change</themekey>
        <themekey>Storms</themekey>
        <themekey>Sea-level Change</themekey>
      </theme>
      <theme>
        <themekt>Marine Realms Information Bank (MRIB) keywords</themekt>
        <themekey>sea level change</themekey>
        <themekey>waves</themekey>
        <themekey>coastal erosion</themekey>
      </theme>
      <theme>
        <themekt>None</themekt>
        <themekey>U.S. Geological Survey</themekey>
        <themekey>USGS</themekey>
        <themekey>Coastal and Marine Hazards and Resources Program</themekey>
        <themekey>CMHRP</themekey>
        <themekey>Pacific Coastal and Marine Science Center</themekey>
        <themekey>PCMSC</themekey>
      </theme>
    </keywords>
    <accconst>No access constraints</accconst>
    <useconst>USGS-authored or produced data and information are in the public domain from the U.S. Government and are freely redistributable with proper metadata and source attribution. Please recognize and acknowledge the U.S. Geological Survey as the originator(s) of the dataset and in products derived from these data. Users are advised to read the rest of the metadata record carefully for additional details.</useconst>
    <ptcontac>
      <cntinfo>
        <cntorgp>
          <cntorg>U.S. Geological Survey, Pacific Coastal and Marine Science Center</cntorg>
          <cntper>PCMSC Science Data Coordinator</cntper>
        </cntorgp>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>2885 Mission Street</address>
          <city>Santa Cruz</city>
          <state>CA</state>
          <postal>95060</postal>
        </cntaddr>
        <cntvoice>831-427-4747</cntvoice>
        <cntemail>pcmsc_data@usgs.gov</cntemail>
      </cntinfo>
    </ptcontac>
    <browse>
      <browsen>Image_locations_segmentation_filter.png</browsen>
      <browsed>Image map showing locations of segmented satellite imagery.</browsed>
      <browset>PNG</browset>
    </browse>
    <datacred>This data release was funded by the USGS Coastal and Marine Hazards and Resources Program.</datacred>
    <native>The datasets were created on a Windows 11 operating system using Python 3.10. Results were output and saved in PNG format.</native>
    <crossref>
      <citeinfo>
        <origin>Buscombe, D.</origin>
        <origin>Goldstein, E.B.</origin>
        <pubdate>2022</pubdate>
        <title>A reproducible and reusable pipeline for segmentation of geoscientific imagery</title>
        <othercit>Buscombe, D., and Goldstein, E.B., 2022, A reproducible and reusable pipeline for segmentation of geoscientific imagery: Earth and Space Science, v. 9, e2022EA002332.</othercit>
        <onlink>https://doi.org/10.1029/2022EA002332</onlink>
      </citeinfo>
    </crossref>
    <crossref>
      <citeinfo>
        <origin>Fitzpatrick, S.</origin>
        <origin>Buscombe, D.</origin>
        <origin>Warrick, J.A.</origin>
        <origin>Lundine, M.A.</origin>
        <origin>Vos, K.</origin>
        <pubdate>2024</pubdate>
        <title>CoastSeg: an accessible and extendable hub for satellite-derived-shoreline (SDS) detection and mapping</title>
        <othercit>Fitzpatrick, S., Buscombe, D., Warrick, J.A., Lundine, M.A., and Vos, K., 2024, CoastSeg: an accessible and extendable hub for satellite-derived-shoreline (SDS) detection and mapping: Journal of Open Source Software, v. 9, no. 99, 6683.</othercit>
        <onlink>https://doi.org/10.21105/joss.06683</onlink>
      </citeinfo>
    </crossref>
    <crossref>
      <citeinfo>
        <origin>Vos, K.</origin>
        <origin>Harley, M.D.</origin>
        <origin>Splinter, K.D.</origin>
        <origin>Simmons, J.A.</origin>
        <origin>Turner, I.L.</origin>
        <pubdate>2019</pubdate>
        <title>Sub-annual to multi-decadal shoreline variability from publicly available satellite imagery</title>
        <othercit>Vos, K., Harley, M.D., Splinter, K.D., Simmons, J.A., and Turner, I.L., 2019, Sub-annual to multi-decadal shoreline variability from publicly available satellite imagery: Coastal Engineering, v. 150, p. 160-174.</othercit>
        <onlink>https://doi.org/10.1016/j.coastaleng.2019.04.004</onlink>
      </citeinfo>
    </crossref>
    <crossref>
      <citeinfo>
        <origin>Gorelick, N.</origin>
        <origin>Hancher, M.</origin>
        <origin>Dixon, M.</origin>
        <origin>Ilyushchenko, S.</origin>
        <origin>Thau, D.</origin>
        <origin>Moore, R.</origin>
        <pubdate>2017</pubdate>
        <title>Google Earth Engine: Planetary-scale geospatial analysis for everyone.</title>
        <othercit>Gorelick, N., Hancher, M., Dixon, M., Ilyushchenko, S., Thau, D., and Moore, R., 2017, Google Earth Engine: Planetary-scale geospatial analysis for everyone: Remote Sensing of Environment, v. 202, p. 18-27.</othercit>
        <onlink>https://doi.org/10.1016/j.rse.2017.06.031</onlink>
      </citeinfo>
    </crossref>
  </idinfo>
  <dataqual>
    <logic>Data have undergone QA/QC and fall within expected/reasonable ranges.</logic>
    <complete>Data set is considered complete for the information presented. Users are advised to read the rest of the metadata record carefully for additional details.</complete>
    <lineage>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. Geological Survey</origin>
            <pubdate>2025</pubdate>
            <title>Landsat imagery (from Landsat 8-9)</title>
            <geoform>PNG image</geoform>
            <pubinfo>
              <pubplace>online</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://doi.org/10.5066/P9OGBGM6</onlink>
          </citeinfo>
        </srccite>
        <typesrc>online database</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>19840101</begdate>
              <enddate>20241231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>collection years of satellite imagery</srccurr>
        </srctime>
        <srccitea>Landsat imagery</srccitea>
        <srccontr>The archive of Landsat 8 and Landsat 9 satellite imagery was accessed through Google Earth Engine.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. Geological Survey</origin>
            <pubdate>2025</pubdate>
            <title>Landsat imagery (from Landsat 7)</title>
            <geoform>PNG image</geoform>
            <pubinfo>
              <pubplace>online</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://doi.org/10.5066/P9C7I13B</onlink>
          </citeinfo>
        </srccite>
        <typesrc>online database</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>19840101</begdate>
              <enddate>20241231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>collection years of satellite imagery</srccurr>
        </srctime>
        <srccitea>Landsat imagery</srccitea>
        <srccontr>The archive of Landsat 7 satellite imagery was accessed through Google Earth Engine.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>U.S. Geological Survey</origin>
            <pubdate>2025</pubdate>
            <title>Landsat imagery (from Landsat 5)</title>
            <geoform>PNG image</geoform>
            <pubinfo>
              <pubplace>online</pubplace>
              <publish>U.S. Geological Survey</publish>
            </pubinfo>
            <onlink>https://doi.org/10.5066/P9IAXOVV</onlink>
          </citeinfo>
        </srccite>
        <typesrc>online database</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>19840101</begdate>
              <enddate>20241231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>collection years of satellite imagery</srccurr>
        </srctime>
        <srccitea>Landsat imagery</srccitea>
        <srccontr>The archive of Landsat 5 satellite imagery was accessed through Google Earth Engine.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>Copernicus, a program of the European Union</origin>
            <pubdate>2025</pubdate>
            <title>Sentinel-2 imagery</title>
            <geoform>PNG image</geoform>
            <pubinfo>
              <pubplace>online</pubplace>
              <publish>Copernicus, a program of the European Union</publish>
            </pubinfo>
            <onlink>https://dataspace.copernicus.eu/explore-data/data-collections/sentinel-data/sentinel-2</onlink>
          </citeinfo>
        </srccite>
        <typesrc>online database</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>19840101</begdate>
              <enddate>20241231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>collection years of satellite imagery</srccurr>
        </srctime>
        <srccitea>Sentinel-2A imagery</srccitea>
        <srccontr>The archive of Sentinel 2A satellite imagery was accessed through Google Earth Engine.</srccontr>
      </srcinfo>
      <srcinfo>
        <srccite>
          <citeinfo>
            <origin>Planet Labs PBC</origin>
            <pubdate>2025</pubdate>
            <title>PlanetScope imagery</title>
            <geoform>PNG image</geoform>
            <pubinfo>
              <pubplace>online</pubplace>
              <publish>Planet Labs PBC</publish>
            </pubinfo>
          </citeinfo>
        </srccite>
        <typesrc>online database</typesrc>
        <srctime>
          <timeinfo>
            <rngdates>
              <begdate>20150101</begdate>
              <enddate>20241231</enddate>
            </rngdates>
          </timeinfo>
          <srccurr>collection years of satellite imagery</srccurr>
        </srctime>
        <srccitea>PlanetScope imagery</srccitea>
        <srccontr>The archive of PlanetScope satellite imagery was accessed through the PlanetScope Application Programming Interface.</srccontr>
      </srcinfo>
      <procstep>
        <procdesc>Set up the CoastSeg toolbox (Fitzpatrick and others, 2024) for implementation over the region of interest. The toolbox was set up in Python 3.10 to run over coastlines at numerous worldwide locations (see map) for the time period of 01 March 1984 to 31 December 2024. Images were then manually classified as either suitable or unsuitable for analysis.</procdesc>
        <procdate>20240701</procdate>
      </procstep>
      <procstep>
        <procdesc>Ran the CoastSeg toolbox on Landsat imagery available through Google Earth Engine (Gorelick and others, 2017) for the geography and time period of interest. Imagery had a horizontal resolution of between 4 and 30 m depending on source. Imagery with an original horizontal resolution of 30 m was pan-sharpened to 15 m. The geospatial information has been removed; it is not necessary for the intended purpose of training Machine Learning models to discriminate between suitable and unsuitable imagery. Only the red, green, and blue channels have been extracted from the original multispectral imagery and, after pan-sharpening, these three bands are saved as a PNG format image. Then, an image segmentation model was made following the methods of Buscombe and Goldstein (2022) for the purposes of semantic image segmentation. The classes used to label imagery were: a) water, b) whitewater (surf), c) sediment, d) other. Once trained, the model was applied to folders of images, and a false-color image in PNG format was output by the model for each input image. These images were then sorted into good and bad subsets, and then train and test splits were made of the data. These data were used to train a Machine Learning model to recognize the quality of an image segmentation.</procdesc>
        <srcused>Sentinel-2A imagery</srcused>
        <srcused>Landsat imagery</srcused>
        <srcused>PlanetScope imagery</srcused>
        <procdate>20240801</procdate>
      </procstep>
      <procstep>
        <procdesc>Checked output to ensure quality results.</procdesc>
        <procdate>20240801</procdate>
      </procstep>
      <procstep>
        <procdesc>Organized image data into folders of suitable and unsuitable images. Originating imagery dates and times are included in the file names. No positions are encoded in the files because these data are intended solely to train a Machine Learning model to identify imagery suitable for shoreline analysis (such as imagery in which the shoreline is visible to the human eye). The imagery is organized into two folders: one for training and one for testing a Machine Learning model.</procdesc>
        <procdate>20240801</procdate>
      </procstep>
    </lineage>
  </dataqual>
  <spdoinfo>
    <indspref>Data were generated within a numerical model scheme. The model training data presented are not for a particular geographic area.</indspref>
  </spdoinfo>
  <eainfo>
    <overview>
      <eaover>This dataset contains classified satellite imagery for the purposes of training a Machine Learning model for the task of determining if the imagery is suitable for satellite shoreline analysis. These images have been manually classified.
There is one zipped folder for images used for training a Machine Learning model, called 'train', and another zipped folder, called 'test', for images used for testing that model once trained. Inside the 'test' and 'train' zipped folders, there are two folders of JPEG images: 'good' (suitable for shoreline extraction using CoastSeg or CoastSat) and 'bad' (unsuitable for shoreline extraction using CoastSeg or CoastSat).
Each image name contains a string that includes the date and time, as well as the name of the sensor. PS denotes PlanetScope imagery. S2 denotes Sentinel 2A imagery. L5 denotes Landsat 5. L7 denotes Landsat 7. L8 denotes Landsat 8. L9 denotes Landsat 9. The date and time of image collection (UTC) are in yyyy-mm-dd-hh-MM-SS format (where yyyy is 4-digit year, mm is 2-digit month, dd is 2-digit day, hh is 2-digit hour in 24-hour format, MM is 2-digit minutes, and SS is 2-digit seconds). For example, the file 'ID_spl62_datetime06-21-24__05_32_07_2017-08-27-10-56-52_RGB_S2.jpg' is an image from Sentinel 2A collected on 2017-08-27 at 10:56:52.
These data are ready to be ingested into a deep learning or machine learning model training pipeline, using software such as TensorFlow, Keras, or PyTorch.</eaover>
      <eadetcit>U.S. Geological Survey</eadetcit>
    </overview>
  </eainfo>
  <distinfo>
    <distrib>
      <cntinfo>
        <cntorgp>
          <cntorg>U.S. Geological Survey - CMGDS</cntorg>
        </cntorgp>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>2885 Mission Street</address>
          <city>Santa Cruz</city>
          <state>CA</state>
          <postal>95060</postal>
        </cntaddr>
        <cntvoice>831-427-4747</cntvoice>
        <cntemail>pcmsc_data@usgs.gov</cntemail>
      </cntinfo>
    </distrib>
    <resdesc>These data are available in zipped folders. There is one zipped folder for images used for training a Machine Learning model, called 'train', and another zipped folder for images used for testing that model once trained. Inside the test and train zipped folders, there are two folders of PNG images: 'good' (suitable for shoreline extraction using CoastSeg or CoastSat) and 'bad' (unsuitable for shoreline extraction using CoastSeg or CoastSat). Each image name contains a string that includes the date and time, as well as the name of the sensor. PS denotes PlanetScope imagery. S2 denotes Sentinel 2A imagery. L5 denotes Landsat 5. L7 denotes Landsat 7. L8 denotes Landsat 8. L9 denotes Landsat 9.</resdesc>
    <distliab>Unless otherwise stated, all data, metadata and related materials are considered to satisfy the quality standards relative to the purpose for which the data were collected. Although these data and associated metadata have been reviewed for accuracy and completeness and approved for release by the U.S. Geological Survey (USGS), no warranty expressed or implied is made regarding the display or utility of the data on any other system or for general or scientific purposes, nor shall the act of distribution constitute any such warranty.</distliab>
    <stdorder>
      <digform>
        <digtinfo>
          <formname>PNG</formname>
          <formcont>zip files containing images</formcont>
          <filedec>WinZip or archive utility</filedec>
          <transize>586.2</transize>
        </digtinfo>
        <digtopt>
          <onlinopt>
            <computer>
              <networka>
                <networkr>https://doi.org/10.5066/P1N4VI7H</networkr>
              </networka>
            </computer>
            <accinstr>Data can be downloaded using the Network_Resource_Name link then scrolling down to the Satellite Data section.</accinstr>
          </onlinopt>
        </digtopt>
      </digform>
      <fees>None.</fees>
    </stdorder>
    <techpreq>These data can be viewed with image (picture) viewing software or numerical processing software such as Python or MATLAB.</techpreq>
  </distinfo>
  <metainfo>
    <metd>20250325</metd>
    <metc>
      <cntinfo>
        <cntorgp>
          <cntorg>U.S. Geological Survey, Pacific Coastal and Marine Science Center</cntorg>
          <cntper>PCMSC Science Data Coordinator</cntper>
        </cntorgp>
        <cntaddr>
          <addrtype>mailing and physical</addrtype>
          <address>2885 Mission Street</address>
          <city>Santa Cruz</city>
          <state>CA</state>
          <postal>95060</postal>
        </cntaddr>
        <cntvoice>831-427-4747</cntvoice>
        <cntemail>pcmsc_data@usgs.gov</cntemail>
      </cntinfo>
    </metc>
    <metstdn>Content Standard for Digital Geospatial Metadata</metstdn>
    <metstdv>FGDC-STD-001-1998</metstdv>
  </metainfo>
</metadata>
