mydatapreprocessing.load_data.load_data_functions package
Helpers for data loading issues.
-
mydatapreprocessing.load_data.load_data_functions.download_data_from_url(url: str, ssl_verification: None | bool | str = None) → io.BytesIO
Download data from the given URL and return it as io.BytesIO.
Parameters:
- url (str) – URL of the file to download.
- ssl_verification (None | bool | str, optional) – Same meaning as in the requests library. Defaults to None.
Raises: FileNotFoundError – If the URL is not available.
Returns: The downloaded content as io.BytesIO, so it can later be passed, for example, to pandas read_x functions.
Return type: io.BytesIO
Example
>>> downloaded = download_data_from_url(
...     "https://github.com/Malachov/mydatapreprocessing/blob/master/tests/test_files/csv.csv?raw=true"
... )
>>> downloaded
<_io.BytesIO object at...>
>>> downloaded.readline()
b'Column 1, Column 2...
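The following is a minimal sketch of how a returned io.BytesIO buffer can be consumed. It assumes no network access, so a hand-made buffer stands in for the result of download_data_from_url, and the CSV content is made up for illustration. It uses only the standard library; pandas.read_csv accepts the same kind of file-like object.

```python
import csv
import io

# Stand-in for the object returned by download_data_from_url.
# The CSV content is invented for this example.
buffer = io.BytesIO(b"Column 1,Column 2\n1,2\n3,4\n")

# readline() advances the read position, so rewind before parsing.
first_line = buffer.readline()
buffer.seek(0)

# Wrap the binary buffer as text and parse it as CSV.
text_stream = io.TextIOWrapper(buffer, encoding="utf-8", newline="")
rows = list(csv.reader(text_stream))

print(first_line)
print(rows)
```

Note the seek(0) call: after readline(), as in the example above, the buffer position is no longer at the start, so a parser given the same buffer would miss the header row.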
-
mydatapreprocessing.load_data.load_data_functions.get_file_type(data_path: Path, request_datatype_suffix: None | str = None)
Take a file name or URL and return its file extension.
If the extension is not at the end of the URL, supply it separately via request_datatype_suffix.
Parameters:
- data_path (Path) – Path to the file. It can also be a URL, but it must be passed as a pathlib.Path.
- request_datatype_suffix (None | str) – Extension to use if the name has none. Defaults to None.
Raises: TypeError – If the extension cannot be inferred and is not defined with the parameter.
Returns: Lowercased extension, for example 'csv'.
Return type: str
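The behaviour described above can be sketched with pathlib alone. This is not the library's implementation: infer_file_type is a hypothetical name, and the order of precedence (extension from the name first, then the explicit suffix) is an assumption based on the parameter description.

```python
from pathlib import Path
from typing import Optional


def infer_file_type(data_path: Path, request_datatype_suffix: Optional[str] = None) -> str:
    """Hypothetical helper, not part of mydatapreprocessing."""
    # Path.suffix includes the leading dot, e.g. ".CSV"; drop it and lowercase.
    suffix = data_path.suffix[1:].lower()

    # Assumption: fall back to the explicit suffix only when the name has none.
    if not suffix and request_datatype_suffix:
        suffix = request_datatype_suffix.lstrip(".").lower()

    if not suffix:
        raise TypeError("File extension could not be inferred and was not provided.")

    return suffix


from_name = infer_file_type(Path("data/Table.CSV"))
from_param = infer_file_type(Path("data/noext"), ".JSON")
print(from_name, from_param)
```

This sketch does not handle URLs with a query string after the extension, which is the case where passing the suffix explicitly is useful.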
-
mydatapreprocessing.load_data.load_data_functions.return_test_data(data: typing_extensions.Literal['test_ramp', 'test_sin', 'test_random', 'test_ecg']) → pandas.core.frame.DataFrame
Return test data selected by name.
Parameters: data (Literal['test_ramp', 'test_sin', 'test_random', 'test_ecg']) – Possible test data. Most of it is generated; test_ecg is real data.
Returns: Test data.
Return type: pd.DataFrame
Example
>>> return_test_data('test_ramp')
   0
0  0
1  1
2  2
...
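As a rough illustration of what generated test data of this kind might look like, here is a standard-library sketch. make_test_signal is a hypothetical function, not the library's code: the length, the sine step and the seed are assumptions, and it returns plain lists rather than a pandas DataFrame. The real test_ecg data is not reproduced.

```python
import math
import random


def make_test_signal(name: str, length: int = 10, seed: int = 0) -> list:
    """Hypothetical generator, not part of mydatapreprocessing."""
    if name == "test_ramp":
        # Increasing integers starting at 0, as in the example output above.
        return list(range(length))
    if name == "test_sin":
        # Assumed step of 0.1 radians between samples.
        return [math.sin(i / 10) for i in range(length)]
    if name == "test_random":
        # Seeded generator so repeated calls give the same values.
        rng = random.Random(seed)
        return [rng.random() for _ in range(length)]
    raise ValueError(f"Unknown test data name: {name!r}")


ramp = make_test_signal("test_ramp", 5)
sine = make_test_signal("test_sin", 5)
print(ramp)
```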