hip_data_tools package

Submodules

hip_data_tools.common module

Module contains variables and methods used for common / shared operations throughout the package

class hip_data_tools.common.DictKeyValueSource(data)

Bases: hip_data_tools.common.KeyValueSource

class for sourcing secrets from a provided Dict object, usually used for testing

exists(key)

verify if a key exists

Parameters:key (str) – the key to be verified for existence

Returns: bool

get(key)

get the value for a given key

Parameters:key (str) – the key for which the value needs to be returned

Returns: str

hip_data_tools.common.ENVIRONMENT = <hip_data_tools.common.EnvironmentKeyValueSource object>

Standard Environment Variable Secret source to be reused across the project

class hip_data_tools.common.EnvironmentKeyValueSource

Bases: hip_data_tools.common.KeyValueSource

class for sourcing secrets from env variables

exists(key)

verify if a key exists

Parameters:key (str) – the key to be verified for existence

Returns: bool

get(key)

get the value for a given key

Parameters:key (str) – the key for which the value needs to be returned

Returns: str

class hip_data_tools.common.KeyValueSource

Bases: abc.ABC

Abstract class for sourcing secrets; it is a key-value source for retrieving values for keys

exists(key)

Abstract method to verify if a key exists in the given data store

Parameters:key (str) – the key to be verified for existence

Returns: bool

get(key: str) → str

Abstract method to get the value for a given key

Parameters:key (str) – the key for which the value needs to be returned

Returns: str
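The exists/get contract described above can be illustrated with a minimal, self-contained sketch. The class names below (SketchKeyValueSource, SketchDictSource, SketchEnvSource) are hypothetical stand-ins written for illustration; they are not the package's implementation, and the real classes may behave differently, for example when a key is missing.

```python
import os
from abc import ABC, abstractmethod


class SketchKeyValueSource(ABC):
    """Hypothetical stand-in for an abstract key-value source."""

    @abstractmethod
    def exists(self, key: str) -> bool:
        """Return True if the key is present in the backing store."""

    @abstractmethod
    def get(self, key: str) -> str:
        """Return the value stored for the key."""


class SketchDictSource(SketchKeyValueSource):
    """Backed by a plain dict, which is convenient in tests."""

    def __init__(self, data: dict):
        self._data = dict(data)

    def exists(self, key: str) -> bool:
        return key in self._data

    def get(self, key: str) -> str:
        return self._data[key]


class SketchEnvSource(SketchKeyValueSource):
    """Backed by the process environment variables."""

    def exists(self, key: str) -> bool:
        return key in os.environ

    def get(self, key: str) -> str:
        return os.environ[key]
```

Because both sketch classes share one interface, calling code can take a dict-backed source in tests and an environment-backed source in deployment without changing.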

hip_data_tools.common.LOG = <Logger hip_data_tools.common (WARNING)>

logger object to handle logging in the entire package

class hip_data_tools.common.SecretsManager(required_keys: list, source: hip_data_tools.common.KeyValueSource)

Bases: abc.ABC

A secret management abstract class that provides ways of extracting secrets. The class allows a subsequent connection class to use env vars to extract secrets in a structured manner

Parameters:
  • required_keys (list[str]) – a list of keys which will be checked for existence
  • source (KeyValueSource) – a kv source that has secrets

get_secret(key)

get the secret value for a given key

Parameters:key (str) – the key for given secret

Returns: str
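As a rough sketch of the pattern described for this class, the example below checks a list of required keys against a source and then reads secrets through it. Every name here is hypothetical, and raising KeyError at construction time for missing keys is an assumption; the documentation only states that the required keys are checked for existence.

```python
class SketchDictSource:
    """Hypothetical dict-backed source exposing exists() and get()."""

    def __init__(self, data):
        self._data = dict(data)

    def exists(self, key):
        return key in self._data

    def get(self, key):
        return self._data[key]


class SketchSecretsManager:
    """Hypothetical stand-in; not the package's SecretsManager."""

    def __init__(self, required_keys, source):
        # Assumption: missing keys are reported eagerly with a KeyError.
        missing = [key for key in required_keys if not source.exists(key)]
        if missing:
            raise KeyError(f"missing required keys: {missing}")
        self._source = source

    def get_secret(self, key):
        # Delegate the lookup to the underlying key-value source.
        return self._source.get(key)


manager = SketchSecretsManager(
    required_keys=["db_user"],
    source=SketchDictSource({"db_user": "alice"}),
)
print(manager.get_secret("db_user"))
```

Validating up front means a misconfigured deployment fails when the connection object is created, not later at the first lookup.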

hip_data_tools.common.camel_case_detect = re.compile('(?<!^)(?=[A-Z])')

Regex pattern to be reused for Camel Case
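The pattern matches the empty position before every uppercase letter except at the start of the string, so substituting an underscore there and lowercasing the result gives snake case. The helper name camel_to_snake below is hypothetical and only illustrates how the pattern can be applied.

```python
import re

# Same pattern as documented above.
camel_case_detect = re.compile('(?<!^)(?=[A-Z])')


def camel_to_snake(name: str) -> str:
    """Insert '_' before each non-leading capital, then lowercase."""
    return camel_case_detect.sub('_', name).lower()


print(camel_to_snake("camelCaseName"))  # camel_case_name
```

Note that the pattern treats every capital separately, so an acronym such as "HTTPServer" becomes "h_t_t_p_server" under this helper.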

hip_data_tools.common.dataframe_columns_to_snake_case(data: pandas.core.frame.DataFrame) → None

hip_data_tools.common.flatten_nested_dict(data: dict, delimiter: str = '_', snake_cased_keys: bool = True) → dict

takes an arbitrarily nested dictionary and flattens it to one level of key value pairs

Parameters:
  • data (dict) – the dictionary to be unnested
  • delimiter (str) – the delimiter to concatenate nested keys
  • snake_cased_keys (bool) – convert the keys in resulting dict to snake case

Returns: dict
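One common way to implement this kind of flattening is a recursive walk that joins parent and child keys with the delimiter. The function flatten_sketch below is a hypothetical illustration under that assumption; it is not the package's code, it omits the snake_cased_keys option, and it drops empty nested dicts.

```python
def flatten_sketch(data: dict, delimiter: str = "_", parent_key: str = "") -> dict:
    """Flatten nested dicts into a single level of key-value pairs."""
    flat = {}
    for key, value in data.items():
        # Prefix the key with its parent path, if there is one.
        full_key = f"{parent_key}{delimiter}{key}" if parent_key else str(key)
        if isinstance(value, dict):
            # Recurse into nested dicts, carrying the key path along.
            flat.update(flatten_sketch(value, delimiter, full_key))
        else:
            flat[full_key] = value
    return flat


nested = {"a": {"b": 1, "c": {"d": 2}}, "e": 3}
print(flatten_sketch(nested))  # {'a_b': 1, 'a_c_d': 2, 'e': 3}
```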

hip_data_tools.common.get_from_env_or_default_with_warning(env_var, default_val)

Get an environment variable or, if it isn’t present, default to a specific value

Parameters:
  • env_var (str) – Name of the environmental variable to read
  • default_val (any) – Value to default to if relevant env var is not present

Returns (Any): Value
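The behaviour described above could look roughly like the following sketch. The function name env_or_default_sketch, the logger, and the warning text are assumptions made for illustration; the package's own message and logger may differ.

```python
import logging
import os

LOG = logging.getLogger("env_default_sketch")


def env_or_default_sketch(env_var, default_val):
    """Return the environment value, or warn and return the default."""
    if env_var in os.environ:
        return os.environ[env_var]
    # Assumption: a warning is logged only when the default is used.
    LOG.warning("Environment variable %s is not set, using default", env_var)
    return default_val
```

Values read from the environment are always strings, while the default can be of any type, so callers may need to convert the result themselves.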

hip_data_tools.common.nested_list_of_dict_to_dataframe(data: List[dict]) → pandas.core.frame.DataFrame

hip_data_tools.common.special_characters_detect = re.compile('[^a-zA-Z0-9]')

Regex pattern to detect special characters

hip_data_tools.common.to_snake_case(column_name: str) → str

Converts the column name to Athena compatible snake_case

Parameters:column_name (str) – column name string to be sanitized

Returns: str
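A plausible way to combine the two documented regex patterns into a column-name sanitizer is sketched below. The order of the steps and the collapsing of repeated underscores are assumptions, and to_snake_case_sketch is a hypothetical name; the package's to_snake_case may produce different results for some inputs.

```python
import re

# Patterns as documented for this module.
camel_case_detect = re.compile('(?<!^)(?=[A-Z])')
special_characters_detect = re.compile('[^a-zA-Z0-9]')


def to_snake_case_sketch(column_name: str) -> str:
    """Sanitize a column name into lowercase snake_case (illustrative only)."""
    # Replace anything that is not a letter or digit with an underscore.
    cleaned = special_characters_detect.sub('_', column_name)
    # Split camel case on capitals, then lowercase everything.
    snake = camel_case_detect.sub('_', cleaned).lower()
    # Assumption: runs of underscores are collapsed into one.
    return re.sub('_+', '_', snake)


print(to_snake_case_sketch("firstName"))     # first_name
print(to_snake_case_sketch("total-Amount"))  # total_amount
```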

hip_data_tools.common.validate_and_fix_common_integer_fields(df: pandas.core.frame.DataFrame)

Module contents