U.S. Geological Survey
Open-File Report 2005-1001
USGS East-Coast Sediment Analysis: Procedures, Database, and GIS Data
Our objective was to gather all of the available bottom-sediment grain-size data produced by the Woods Hole Coastal and Marine Science Center of the U.S. Geological Survey into a scientifically edited database that allows scientists, policy makers, and others to manipulate, query, and display the original data themselves for their own specific applications. The database formats are required to be comprehensive and simple, both for entering and for extracting data.
The basic structure of the database is a matrix in which the rows are records representing individual samples and the columns contain information on sample identification, navigation, classifications, analyzed parameters, and comments. This is a "flat file" format, which means that it is not "normalized". While this format is considered inefficient from the point of view of database management, it is the simplest way of presenting the basic data. This structure was chosen to avoid ambiguity and to make the process of locating fields, entering data, and validating the data as simple yet comprehensive as possible. Since we know neither the software capabilities of the user nor the probable uses that may be made of the data, we have made no attempt to split the files to reduce blank fields or remove redundancies.
The same data may be presented in more than one form, for example phi class frequencies and cumulative frequencies. Even though each form can be derived from the other, presenting both eliminates the need for the user to program formulas to calculate one from the other. Although this may violate the principle of having a single entry for any given data item, it greatly simplifies the use of the file. If the user wishes to make the database more efficient through "normalization", we feel that it is better that this be done by the user, to fit both the applications available to the user and the database structural logic that is familiar to the user.
The price paid for the "flat file" approach is additional storage space, rather wide records, and the possibility that corrections made here at the source may fail to be carried through to all forms of the data affected. We have made every effort to ensure that this last possibility did not occur.
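To illustrate the redundancy described above, the following is a minimal Python sketch of how phi class frequencies and cumulative frequencies can each be derived from the other. The function names and the sample values are hypothetical and are not taken from the database; the sketch assumes the class frequencies are listed in order of increasing phi.

```python
from itertools import accumulate

def to_cumulative(class_freqs):
    """Running sum of per-class frequencies (weight percent)."""
    return list(accumulate(class_freqs))

def to_class(cumulative):
    """Inverse operation: difference of consecutive cumulative values."""
    previous = [0] + list(cumulative[:-1])
    return [c - p for c, p in zip(cumulative, previous)]

# Hypothetical sample: weight percent in five consecutive phi classes.
phi_percent = [5, 20, 40, 25, 10]

cumulative_percent = to_cumulative(phi_percent)
recovered_percent = to_class(cumulative_percent)
```

Because the database supplies both forms, a user does not need code of this kind; it may still be useful as a consistency check after editing either form.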
The database presented here contains 58 fields (please refer to the Data Dictionary). The specific fields and parameters have been chosen based on the data produced by the Sedimentation Laboratory at the Woods Hole Coastal and Marine Science Center, and the format of information typically found in the literature. Because the data have come from numerous projects, there are differing amounts and types of information. Most of the samples or sets of samples do not have data in all of the given fields; however, additional fields, qualifiers, and data can be added in virtually unlimited fashion to accommodate specific needs.
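Because many samples leave some fields blank, it can help to check how well each field is populated before using it. The sketch below is one possible way to do this in Python; the field names (`SAMPLE_ID`, `DEPTH_M`, `GRAVEL_PCT`) and the records are hypothetical placeholders, and the actual field names are those listed in the Data Dictionary.

```python
from collections import Counter

def field_coverage(records):
    """Count, for each field, how many records hold a non-blank value."""
    counts = Counter()
    for record in records:
        for name, value in record.items():
            if value is not None and str(value).strip() != "":
                counts[name] += 1
    return counts

# Hypothetical records; blank strings stand for empty fields.
records = [
    {"SAMPLE_ID": "A1", "DEPTH_M": "12.5", "GRAVEL_PCT": ""},
    {"SAMPLE_ID": "A2", "DEPTH_M": "", "GRAVEL_PCT": "3.1"},
]

coverage = field_coverage(records)
```

A field that never appears with a value is simply absent from the result, so `coverage.get(name, 0)` is a safe way to look up any field.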
The database itself is provided in three formats: comma-delimited ASCII text (.csv), Microsoft Excel 2010 (.xls), and Esri shapefile (.shp). The comma-delimited file contains the data, together with the column headings for the tables of data, in uncompressed ASCII format. This file is supplied for users who do not have a Windows-compatible computer and for users who wish to import the data into applications that accept ASCII character information. The formatted files will open in the appropriate software if the user has the applications installed and a properly configured web browser. The database can be accessed through the Data Catalog section of this report.
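As an illustration of importing the comma-delimited format, the following Python sketch reads text with a header row using the standard-library `csv` module and maps blank fields to `None`. The column names and values in the sample text are hypothetical and are assumed only for the example; the real headings are those in the distributed file.

```python
import csv
import io

def read_samples(text):
    """Parse comma-delimited text with a header row into a list of dicts.

    Blank or missing fields are returned as None.
    """
    reader = csv.DictReader(io.StringIO(text))
    rows = []
    for row in reader:
        rows.append({
            name: (value.strip() or None) if isinstance(value, str) else None
            for name, value in row.items()
        })
    return rows

# Hypothetical two-sample extract; the second sample has a blank field.
sample_text = (
    "SAMPLE_ID,LATITUDE,LONGITUDE,SAND_PCT\n"
    "A1,41.52,-70.67,88.2\n"
    "A2,41.60,-70.50,\n"
)

rows = read_samples(sample_text)
```

To read the distributed file instead of the inline sample, the same function can be given the contents of the file, for example the result of `open(path, newline="").read()`, where `path` is wherever the user saved the .csv file.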