mydatapreprocessing.consolidation.consolidation_config package
Consolidation config and subconfig classes.
-
mydatapreprocessing.consolidation.consolidation_config.default_consolidation_config
Default config that you can use directly. Use your editor's IntelliSense tooltips to see which options are available, or use the update method for bulk configuration.
Type: mydatapreprocessing.consolidation.consolidation_config.ConsolidationConfig
-
class mydatapreprocessing.consolidation.consolidation_config.ConsolidationConfig(frozen=None, *a, **kw)
Bases: mypythontools.config.config_internal.Config
Config class for consolidate_data pipeline.
A default_consolidation_config object is already created. You can import it, edit it, and use it. Static type checking and IntelliSense should work.
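As a minimal sketch of the import-and-edit workflow described above, the fragment below sets a few of the options documented on this page by plain attribute assignment. The values are illustrative only. Note the assumption that editing the imported object changes the shared default instance for every other user of it in the same process.

```python
from mydatapreprocessing.consolidation.consolidation_config import (
    default_consolidation_config,
)

# Assumption: this is the shared, pre-built ConsolidationConfig instance,
# so the edits below are visible wherever the default config is used.
config = default_consolidation_config

# Options documented on this page; the values are illustrative only.
config.check_shape_and_transform = True
config.data_length = 1000  # 0 would mean "use all the data"
config.dtype = "float32"
config.inplace = False
```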
-
check_shape_and_transform
Check whether the data has the expected shape and transpose it if necessary.
Type
bool
Default
True
A table usually has many more rows than columns. If it does not, the dimensions may have been swapped when the data was loaded. This option checks for that, transposes the data if necessary, and logs the change.
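The shape check can be illustrated with a plain-Python sketch. This is not the library's implementation, which works on tabular data such as pandas objects. The function name ensure_rows_are_samples and the list-of-rows representation are assumptions made for illustration.

```python
def ensure_rows_are_samples(table):
    """Transpose a list-of-rows table when it has more columns than rows.

    Returns the (possibly transposed) table and a flag saying whether a
    transpose happened, so that the caller can log it.
    """
    n_rows = len(table)
    n_cols = len(table[0]) if table else 0
    if n_cols > n_rows:
        # zip(*table) regroups the values column by column, i.e. a transpose.
        return [list(column) for column in zip(*table)], True
    return table, False


wide = [[1, 2, 3, 4], [5, 6, 7, 8]]  # 2 rows, 4 columns: probably swapped
fixed, was_transposed = ensure_rows_are_samples(wide)
```

Here fixed has four rows of two values each, and was_transposed reports that the table was changed.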
-
data_length
Limit the data length after resampling.
Type
int
Default
0
If 0, then all the data is used.
-
datetime
= None
Set the datetime index and convert it to a datetime type.
-
dtype
Set the output dtype.
Type
str | np.dtype | pd.Series | list[str | np.dtype]
Default
"float32"
For possible inputs, see the pandas astype function.
-
first_column
Move the specified column to index 0.
Type
None | PandasIndex
Default
None
-
inplace
Define whether to work on the inserted data itself or on a copy.
Type
bool
Default
False
The copy is created only once; internally, all the consolidating functions then work in place. The syntax differs slightly from, for example, pandas: assign the result to a variable, e.g. df = consolidate_data(df), even with inplace. If True, your inserted data will be changed.
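The copy-once behavior can be sketched in plain Python. The function consolidate_sketch is a hypothetical stand-in rather than the library's consolidate_data, and the float conversion is just a placeholder for the real consolidation steps.

```python
import copy


def consolidate_sketch(rows, inplace=False):
    """Hypothetical stand-in for a consolidation pipeline on a list of rows."""
    if not inplace:
        rows = copy.deepcopy(rows)  # the only copy that is ever made
    # From here on, every step mutates `rows` directly.
    for row in rows:
        row[:] = [float(value) for value in row]
    return rows


original = [[1, 2], [3, 4]]
result = consolidate_sketch(original)  # works on a copy of original
same = consolidate_sketch(original, inplace=True)  # modifies original itself
```

In both modes the result is taken from the return value, which is why the assignment pattern works regardless of the inplace setting.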
-
remove_missing_values
= None
Define whether and how to remove NaN (Not a Number) values.
-
resample
= None
Resample the data to the specified frequency if there is a datetime column. Values can be aggregated with sum or average.
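To make the difference between the two aggregations concrete, here is a plain-Python sketch of resampling timestamped values to a daily frequency. It only illustrates the concept. The function resample_daily, its how parameter, and the sample values are assumptions for illustration and do not reflect the fields of the resample subconfig, which this page does not list.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def resample_daily(samples, how="sum"):
    """Aggregate (timestamp, value) pairs into one value per calendar day."""
    aggregate = {"sum": sum, "mean": mean}[how]
    buckets = defaultdict(list)
    for timestamp, value in samples:
        buckets[timestamp.date()].append(value)
    return {day: aggregate(values) for day, values in sorted(buckets.items())}


samples = [
    (datetime(2021, 1, 1, 6), 2.0),
    (datetime(2021, 1, 1, 18), 4.0),
    (datetime(2021, 1, 2, 12), 10.0),
]
daily_sum = resample_daily(samples, how="sum")
daily_mean = resample_daily(samples, how="mean")
```

Summing suits quantities that accumulate over the period, such as counts or consumption, while averaging suits levels such as temperature or price.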
-
strings_to_numeric
= None
Remove string values or replace them with numbers.
-