pyrate.algorithms package
Submodules
pyrate.algorithms.aisparser module
Parses the AIS data from CSV or XML files and populates the AIS database.
pyrate.algorithms.aisparser.get_data_source(name)
Guesses data source from file name.
If the name contains ‘terr’ then we guess terrestrial data, otherwise we assume satellite.
Parameters: name (str) – File name
Returns: 0 if satellite, 1 if terrestrial
Return type: int
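The rule described above can be sketched as follows. This is a minimal illustration, not the package's own code: the function name `guess_data_source` is hypothetical, and matching case-insensitively on the base name only is an assumption that the original description does not confirm.

```python
import os

SATELLITE = 0
TERRESTRIAL = 1


def guess_data_source(name):
    """Return 1 if the file name suggests terrestrial data, otherwise 0."""
    # Assumption: only the base name is inspected, ignoring case, so that
    # a directory called "terrestrial" does not influence the guess.
    base = os.path.basename(name).lower()
    return TERRESTRIAL if "terr" in base else SATELLITE
```

The returned value has the same meaning as the `source` argument accepted by the other functions in this module (0 for satellite, 1 for terrestrial).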
pyrate.algorithms.aisparser.parse_file(fp, name, ext, baddata_logfile, cleanq, dirtyq, source=0)
Parses a file containing AIS data, placing rows of data onto queues.
Parameters:
- fp (str) – Filepath of file to be parsed
- name (str) – Name of file to be parsed
- ext (str) – Extension, either ‘.csv’ or ‘.xml’
- baddata_logfile (str) – Name of the logfile
- cleanq – Queue for messages to be inserted into clean table
- dirtyq – Queue for messages to be inserted into dirty table
- source (int, optional, default=0) – 0 is satellite, 1 is terrestrial
Returns:
- invalid_ctr (int) – Number of invalid rows
- clean_ctr (int) – Number of clean rows
- dirty_ctr (int) – Number of dirty rows
- time_elapsed (time) – The time elapsed since starting the parse_file procedure
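The queue-and-counter pattern described above might look roughly like the sketch below. Everything here is illustrative: `route_rows` and the `is_clean` predicate are hypothetical names, and the rule that separates clean from dirty rows is not specified in this documentation, so it is passed in by the caller. Treating an empty row as invalid is likewise an assumption.

```python
import queue
import time


def route_rows(rows, cleanq, dirtyq, is_clean):
    """Place each row on the clean or dirty queue and count the outcomes."""
    start = time.time()
    invalid_ctr = clean_ctr = dirty_ctr = 0
    for row in rows:
        if not row:
            # Assumption: an empty dict marks a row that could not be read.
            invalid_ctr += 1
        elif is_clean(row):
            cleanq.put(row)
            clean_ctr += 1
        else:
            dirtyq.put(row)
            dirty_ctr += 1
    return invalid_ctr, clean_ctr, dirty_ctr, time.time() - start
```

A caller would create two `queue.Queue` objects, pass them in, and have separate consumers drain them into the clean and dirty tables.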
pyrate.algorithms.aisparser.parse_raw_row(row)
Parse values from row, returning a new dict with values converted into appropriate types. Throws an exception to reject the row.
Parameters: row (dict) – A dictionary of headers and values from the csv file
Returns: converted_row – A dictionary of headers and values converted using the helper functions
Return type: dict
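As an illustration of the convert-or-reject idea, the sketch below maps each header to a converter and lets any exception propagate to signal a rejected row. The column names, the `CONVERTERS` table, and the helper functions are all hypothetical; the actual helper functions used by the package are not listed here.

```python
def to_int(value):
    """Convert a string to an integer, raising ValueError on bad input."""
    return int(value)


def to_latitude(value):
    """Convert a string to a latitude in degrees, rejecting out-of-range values."""
    latitude = float(value)
    if not -90.0 <= latitude <= 90.0:
        raise ValueError("latitude out of range: %r" % value)
    return latitude


# Hypothetical mapping from column header to converter.
CONVERTERS = {"MMSI": to_int, "Latitude": to_latitude}


def convert_row(row):
    """Return a new dict with converted values; exceptions reject the row."""
    converted = {}
    for header, value in row.items():
        convert = CONVERTERS.get(header)
        # Columns without a registered converter are copied unchanged.
        converted[header] = convert(value) if convert else value
    return converted
```

The caller is expected to wrap the call in `try`/`except` and treat a raised exception as a rejected row.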
pyrate.algorithms.aisparser.readcsv(fp)
Reads each line in the CSV file, checks that all columns are available, and yields a dictionary of the subset of columns required (as per AIS_CSV_COLUMNS).
If a row is invalid (too few columns), an empty dictionary is yielded instead.
Parameters: fp (str) – File path
Yields: rowsubset (dict) – A dictionary of the subset of columns as per AIS_CSV_COLUMNS
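A generator with this behaviour can be sketched using the standard `csv` module. The names `read_required_columns` and `REQUIRED_COLUMNS` are hypothetical, and the column list is a stand-in: the real contents of AIS_CSV_COLUMNS are not shown in this documentation.

```python
import csv

# Hypothetical stand-in for AIS_CSV_COLUMNS.
REQUIRED_COLUMNS = ["MMSI", "Time", "Latitude", "Longitude"]


def read_required_columns(fp, required=REQUIRED_COLUMNS):
    """Yield the required subset of each CSV row, or {} for a short row."""
    with open(fp, newline="") as handle:
        for row in csv.DictReader(handle):
            # DictReader fills fields missing from a short row with None.
            if any(row.get(column) is None for column in required):
                yield {}
            else:
                yield {column: row[column] for column in required}
```

Yielding an empty dictionary rather than skipping the row lets the caller count invalid rows while still iterating in file order.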
pyrate.algorithms.aisparser.run(inp, out, dropindices=True, source=0)
Populate the AIS_Raw database with messages from the AIS csv files.
Parameters:
- inp (str) – The name of the repositor(-y/-ies) as defined in the global variable INPUTS
- out (str) – The name of the repositor(-y/-ies) as defined in the global variable OUTPUTS
- dropindices (bool, optional, default=True) – Drop indexes for faster insert
- source (int, optional, default=0) – Indicates terrestrial (1) or satellite data (0)
pyrate.algorithms.imolist module
pyrate.algorithms.imolist.create_imo_list(aisdb)
Create the imo list table from MMSI, IMO pairs in clean and dirty tables.
This method collects the unique (MMSI, IMO) pairs from a table, together with the time intervals over which they have been seen in the data. These tuples are then upserted into the imo_list table.
Removes cases where ships have clashing MMSI numbers within a time threshold.
On the clean table pairs with no IMO number are also collected to get the activity intervals of MMSI numbers. On the dirty table only messages specifying an IMO are collected.
Parameters: aisdb (postgresdb) – The database upon which to operate
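The first step described above, collecting unique (MMSI, IMO) pairs with the interval over which each was seen, can be sketched in plain Python. This is only an in-memory illustration with a hypothetical function name and input shape; the package operates on database tables, and the clash removal and upsert steps are not shown.

```python
def collect_intervals(messages):
    """Map each (mmsi, imo) pair to the (first_seen, last_seen) times.

    `messages` is assumed to be an iterable of (mmsi, imo, timestamp)
    tuples; `imo` may be None for messages that carry no IMO number.
    """
    intervals = {}
    for mmsi, imo, timestamp in messages:
        key = (mmsi, imo)
        if key in intervals:
            first, last = intervals[key]
            intervals[key] = (min(first, timestamp), max(last, timestamp))
        else:
            intervals[key] = (timestamp, timestamp)
    return intervals
```

Pairs with `imo` set to `None` correspond to the clean-table case in which activity intervals are collected for MMSI numbers that never report an IMO.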
pyrate.algorithms.vesselimporter module
Extracts a subset of clean ships into ais_extended tables
pyrate.algorithms.vesselimporter.cluster_table(aisdb, table)
Performs a clustering of the postgresql table on the MMSI index.
This process significantly improves the runtime of extended table generation.
pyrate.algorithms.vesselimporter.filter_good_ships(aisdb)
Generate a set of imo numbers and (mmsi, imo) validity intervals for ships which are deemed to be ‘clean’. A clean ship is defined as one which:
- Has valid MMSI numbers associated with it.
- For each MMSI number, the period of time it is associated with this IMO (via message number 5) overlaps with the period the MMSI number was in use.
- For each MMSI number, its usage period does not overlap with that of any other of this ship’s MMSI numbers.
- None of these MMSI numbers has been used by another ship (i.e. no other IMO number is associated with the same MMSI).
Returns:
- valid_imos – A set of valid imo numbers
- imo_mmsi_intervals – A list of (mmsi, imo, start, end) tuples, describing the validity intervals of each (mmsi, imo) pair
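Two of the criteria above, non-overlapping MMSI usage periods for one ship and no MMSI shared between ships, can be illustrated with a small in-memory sketch. The function names are hypothetical, the input is assumed to hold one (mmsi, imo, start, end) tuple per pair, and the MMSI validity and message 5 overlap checks are not covered.

```python
from collections import defaultdict
from itertools import combinations


def overlaps(a_start, a_end, b_start, b_end):
    """Return True if the closed intervals [a_start, a_end] and [b_start, b_end] intersect."""
    return a_start <= b_end and b_start <= a_end


def select_clean_imos(intervals):
    """Return (valid_imos, kept_intervals) for ships passing both checks."""
    imos_by_mmsi = defaultdict(set)
    entries_by_imo = defaultdict(list)
    for mmsi, imo, start, end in intervals:
        imos_by_mmsi[mmsi].add(imo)
        entries_by_imo[imo].append((mmsi, start, end))

    valid_imos = set()
    for imo, entries in entries_by_imo.items():
        # Reject the ship if any of its MMSI numbers is tied to another IMO.
        shared = any(len(imos_by_mmsi[mmsi]) > 1 for mmsi, _, _ in entries)
        # Reject the ship if two of its MMSI usage periods overlap.
        clash = any(
            overlaps(s1, e1, s2, e2)
            for (_, s1, e1), (_, s2, e2) in combinations(entries, 2)
        )
        if not shared and not clash:
            valid_imos.add(imo)

    kept = [entry for entry in intervals if entry[1] in valid_imos]
    return valid_imos, kept
```

The pairwise comparison is quadratic in the number of MMSI numbers per ship, which is acceptable here on the assumption that a single ship has only a handful of them.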