mydatapreprocessing.load_data.load_data_functions.data_parsers package
Parsers for formats that need extra loading logic are kept in this package.

mydatapreprocessing.load_data.load_data_functions.data_parsers.csv_load(data: io.BytesIO | str | Path, csv_style: Literal['infer'] | dict = 'infer', header: Literal['infer'] | None | int = None, max_imported_length: None | int = None) → pd.DataFrame
Load CSV data and infer the separator used.
Parameters:
- data (io.BytesIO | str | Path) – Input data.
- csv_style (Literal["infer"] | dict, optional) – If "infer", the style is inferred automatically; otherwise a dictionary with sep and decimal, e.g. {'sep': ';', 'dec': ','}. Defaults to "infer".
- header (Literal['infer'] | None | int, optional) – First row used, usually the one with column names. Defaults to None.
- max_imported_length (None | int, optional) – Only the last N rows are used. Defaults to None.
Raises: RuntimeError – If loading fails.
Returns: Loaded data.
Return type: pd.DataFrame
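
The separator inference described for csv_load can be illustrated with the standard library alone. This is a minimal sketch, not the package's implementation: infer_separator and the sample text are hypothetical, and using csv.Sniffer is an assumption about one reasonable way to guess a delimiter.

```python
import csv
import io


def infer_separator(text: str, candidates: str = ";,\t|") -> str:
    """Guess the column separator of CSV text (hypothetical helper)."""
    # csv.Sniffer looks for a candidate character that occurs a consistent
    # number of times per line; it raises csv.Error if none is found.
    dialect = csv.Sniffer().sniff(text, delimiters=candidates)
    return dialect.delimiter


# Semicolon-separated sample that uses a decimal comma inside the values.
sample = "a;b;c\n1,5;2,0;3,1\n4,2;5,5;6,0\n"
sep = infer_separator(sample)
rows = list(csv.reader(io.StringIO(sample), delimiter=sep))
```

In this sample the comma count differs between the header and the data lines, so the semicolon is the only consistent candidate. The values still hold decimal commas afterwards, which is why a CSV style needs a decimal setting in addition to the separator.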

mydatapreprocessing.load_data.load_data_functions.data_parsers.json_load(data: str | Path | io.BytesIO, field: str, data_orientation: Literal['index', 'columns'] = 'columns')
Load data from JSON into a DataFrame.
pandas read_json is not used because usually only a subfield containing nested JSON is needed.
Parameters:
- data (str | Path | io.BytesIO) – Input data. Path to a file, or an io.BytesIO object created, for example, from request content.
- field (str, optional) – Use this if you need only one node of the data. Use dots to descend into further levels of nested data, for example "key_1.sub_key_1".
- data_orientation (Literal["index", "columns"], optional) – Defines the orientation of the dict data. Defaults to "columns".
Raises: KeyError – If the defined key is not available.
Returns: Loaded data.
Return type: pd.DataFrame
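
The dotted field syntax can be pictured as a walk through nested dictionaries. The sketch below uses only the standard library; select_field and the sample payload are hypothetical names for illustration and are not part of the package.

```python
import json
from typing import Any


def select_field(data: Any, field: str) -> Any:
    """Walk nested dicts along a dot-separated path (hypothetical helper)."""
    node = data
    for key in field.split("."):
        # Indexing a dict with a missing key raises KeyError.
        node = node[key]
    return node


raw = '{"key_1": {"sub_key_1": {"a": [1, 2], "b": [3, 4]}}, "meta": "x"}'
selected = select_field(json.loads(raw), "key_1.sub_key_1")
```

The selected node is a plain dict of array-like values, the kind of input that load_dict accepts.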

mydatapreprocessing.load_data.load_data_functions.data_parsers.load_dict(data: dict[str, Any], data_orientation: Literal['index', 'columns'] = 'columns')
Load a dict of values into a DataFrame.
Parameters:
- data (dict[str, Any]) – Data with array-like values.
- data_orientation (Literal["index", "columns"], optional) – Defines the orientation of the dict data. Defaults to "columns".
Returns: Loaded data.
Return type: pd.DataFrame
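
The effect of data_orientation is easiest to see on a small dict. The sketch below assumes the two values follow the pandas DataFrame.from_dict convention, where "columns" treats keys as column names and "index" treats keys as row labels. That is an assumption about this function rather than something the text above states. to_rows is a hypothetical pure-Python helper that returns row lists instead of a DataFrame.

```python
from typing import Any, Literal


def to_rows(
    data: dict[str, list[Any]],
    data_orientation: Literal["index", "columns"] = "columns",
) -> list[list[Any]]:
    """Return the table rows implied by each orientation (hypothetical helper)."""
    if data_orientation == "columns":
        # Keys are column names, so the values are columns: transpose them.
        return [list(row) for row in zip(*data.values())]
    if data_orientation == "index":
        # Keys are row labels, so each value is already one row.
        return [list(row) for row in data.values()]
    raise ValueError(f"Unknown data_orientation: {data_orientation!r}")


table = {"a": [1, 2], "b": [3, 4]}
by_columns = to_rows(table, "columns")
by_index = to_rows(table, "index")
```

The same dict therefore yields transposed tables depending on the orientation, so it is worth checking which axis the keys describe before loading.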