Citation:
Citation_Information:
Originator: Evan T. Dailey
Publication_Date: 2017
Title: ViTexOCR; a script to extract text overlays from digital video
Geospatial_Data_Presentation_Form: Python script, PDF documentation
Series_Information:
Series_Name: software release
Issue_Identification: DOI:10.5066/F7833Q56
Publication_Information:
Publication_Place:
Pacific Coastal and Marine Science Center, Santa Cruz, California
Publisher: U.S. Geological Survey
Online_Linkage: https://doi.org/10.5066/F7833Q56
Online_Linkage:
Description:
Abstract:
The ViTexOCR script presents a new method for extracting navigation data from videos with text overlays using optical character recognition (OCR) software. Over the past few decades, it has been common for videos recorded during surveys to be overlaid with real-time geographic positioning satellite chyrons including latitude, longitude, date, and time, as well as other ancillary data (such as speed, heading, or user-input identifying fields). Embedding these data into videos provides them with utility and accuracy, but the location data cannot be used for other purposes, such as analysis in a geographic information system, when they are available only on the video display. Extracting the text data from imagery using software allows these videos to be located and analyzed in a geospatial context.
The script allows a user to select a video and to specify, on a sample video frame, the text data types (e.g., latitude, longitude, date, time, or other), the text color, and the pixel locations of the overlay text. The script’s output is a data file containing the retrieved geospatial and temporal data. All functionality is bundled in a Python script that provides a graphical user interface and relies on several other software dependencies.
Purpose:
The ViTexOCR script was developed to geospatially locate videos, primarily for the purpose of including videos collected through the USGS Coastal and Marine Geology Program in the USGS Video and Photograph Portal.
Time_Period_of_Content:
Time_Period_Information:
Single_Date/Time:
Calendar_Date: 2017
Currentness_Reference: publication date
Status:
Progress: Complete
Maintenance_and_Update_Frequency: As needed
Spatial_Domain:
Bounding_Coordinates:
West_Bounding_Coordinate: -180.0
East_Bounding_Coordinate: 180.0
North_Bounding_Coordinate: 90.0
South_Bounding_Coordinate: -90.0
Keywords:
Theme:
Theme_Keyword_Thesaurus: USGS Metadata Identifier
Theme_Keyword: USGS:58dd56ace4b02ff32c685954
Theme:
Theme_Keyword_Thesaurus: Marine Realms Information Bank (MRIB) keywords
Theme_Keyword: computer science
Theme:
Theme_Keyword_Thesaurus: USGS Thesaurus
Theme_Keyword: scientific software
Theme_Keyword: software development
Theme:
Theme_Keyword_Thesaurus: None
Theme_Keyword: U.S. Geological Survey
Theme_Keyword: USGS
Theme_Keyword: Coastal and Marine Geology Program
Theme_Keyword: CMGP
Theme_Keyword: Pacific Coastal and Marine Science Center
Theme_Keyword: PCMSC
Access_Constraints: none
Use_Constraints:
USGS-authored or produced data and information are in the public domain from the U.S. Government and are freely redistributable with proper metadata and source attribution. Please recognize and acknowledge the U.S. Geological Survey as the originator(s) of this script.
Point_of_Contact:
Contact_Information:
Contact_Person_Primary:
Contact_Person: Evan T. Dailey
Contact_Organization:
U.S. Geological Survey, Pacific Coastal and Marine Science Center
Contact_Address:
Address_Type: mailing and physical
Address: 2885 Mission Street
City: Santa Cruz
State_or_Province: CA
Postal_Code: 95060
Country: United States
Contact_Voice_Telephone: 831-460-7591
Contact_Electronic_Mail_Address: edailey@usgs.gov
Browse_Graphic:
Browse_Graphic_File_Name:
Browse_Graphic_File_Description:
Top: example of an image from a digital video with portions of the text overlays outlined; Bottom: example of navigation data extracted from a digital video and displayed on a map
Browse_Graphic_File_Type: PNG
Native_Data_Set_Environment:
The Python script, filename ViTexOCR.py, was developed using Python version 2.7.11 on Mac OS X version 10.10.4 and Windows 7 64-bit. The Python script file is 34 KB.