Datafier
Datafier is deprecated, use plot specific datafiers instead.
Contains data preparation modules, including interpolation, rank generation, and color generation. The data should have time set as the index.
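As an illustration of the expected input shape, here is a minimal sketch that builds a dataframe with time set as the index. The column names and values are made up for the example, and it assumes that `time_format` is a strftime-style string such as `"%Y-%m-%d"`; how Datafier itself consumes `time_format` is not shown here.

```python
import pandas as pd

# Hypothetical raw records; names and values are illustrative only.
raw = pd.DataFrame(
    {
        "time": ["2021-11-13", "2021-11-18"],
        "a": [1.0, 2.0],
        "b": [4.0, 6.0],
    }
)

# Assumed strftime-style format matching the strings above.
time_format = "%Y-%m-%d"

# Parse the time column and set it as the index.
raw["time"] = pd.to_datetime(raw["time"], format=time_format)
data = raw.set_index("time")

print(data)
```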
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | pd.DataFrame | The data to be prepared, with time set as the index | required |
time_format | str | Index datetime format | required |
ip_freq | str | Interpolation frequency | required |
ip_frac | float | Rank interpolation fraction (see the note below), by default 0.5 | 0.5 |
n_bars | int | Number of bars to be visible on the plot, by default 10 or fewer | 10 |
palettes | list[str] | List of color palettes to generate bar colors, by default ["viridis"] | ['viridis'] |
ip_frac is the fraction of NaN values to be linearly
interpolated for column ranks.
Consider this example:
>>>               a    b
>>> date
>>> 2021-11-13  1.0  4.0
>>> 2021-11-14  NaN  NaN
>>> 2021-11-15  NaN  NaN
>>> 2021-11-16  NaN  NaN
>>> 2021-11-17  NaN  NaN
>>> 2021-11-18  2.0  6.0
With ip_frac set to 0.5, 50% of the NaNs are linearly
interpolated, while the rest are back filled.
>>>                a     b
>>> 2021-11-13  1.00  4.00  << original value
>>> 2021-11-14  1.33  4.67  |
>>> 2021-11-15  1.67  5.33  | 50% linearly interpolated
>>> 2021-11-16  2.00  6.00  <- linear interpolation up to here
>>> 2021-11-17  2.00  6.00  | the rest are filled
>>> 2021-11-18  2.00  6.00  << original value
This adds some stability to the barChartRace
and reduces constant shaking of the bars.
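The table above can be reproduced with plain pandas. The following is a minimal sketch, not the library's implementation: the helper name `partial_interpolate` and its `gap_len` parameter are hypothetical, and it assumes every NaN gap has the same known length.

```python
import numpy as np
import pandas as pd


def partial_interpolate(df: pd.DataFrame, gap_len: int, ip_frac: float = 0.5) -> pd.DataFrame:
    """Back fill the tail of each NaN gap, then interpolate the rest linearly.

    Hypothetical helper: `gap_len` is the number of consecutive NaNs per gap.
    """
    # Number of NaNs per gap that are back filled instead of interpolated.
    n_fill = gap_len - int(gap_len * ip_frac)
    if n_fill > 0:
        # bfill with a limit only fills the NaNs closest to the next valid value.
        df = df.bfill(limit=n_fill)
    # The remaining NaNs are filled by linear interpolation.
    return df.interpolate(method="linear")


index = pd.date_range("2021-11-13", "2021-11-18", freq="D", name="date")
data = pd.DataFrame(
    {
        "a": [1.0, np.nan, np.nan, np.nan, np.nan, 2.0],
        "b": [4.0, np.nan, np.nan, np.nan, np.nan, 6.0],
    },
    index=index,
)

result = partial_interpolate(data, gap_len=4, ip_frac=0.5)
print(result.round(2))
```

With `ip_frac=0.5` and a gap of four NaNs, two values are back filled and two are interpolated, which matches the values in the second table.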
add_var(row_var=None, col_var=None)
Adds additional variables to the data, both row and column wise.
Row wise data format: The index should be equal to that of the actual data.
Column wise data format: The index should be equal to the columns of the actual data.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
row_var | pd.DataFrame | Dataframe containing variables related to time, by default None | None |
col_var | pd.DataFrame | Dataframe containing variables related to columns, by default None | None |
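To make the two index requirements concrete, here is a small sketch of inputs shaped the way the description asks for. The frames, the `total` and `group` variables, and all values are made up for illustration.

```python
import pandas as pd

# Hypothetical main data: time as the index, one column per bar.
data = pd.DataFrame(
    {"a": [1.0, 2.0], "b": [4.0, 6.0]},
    index=pd.to_datetime(["2021-11-13", "2021-11-18"]),
)

# Row wise variables: one row per timestamp, same index as `data`.
row_var = pd.DataFrame({"total": data.sum(axis=1)}, index=data.index)

# Column wise variables: one row per column of `data`.
col_var = pd.DataFrame({"group": ["x", "y"]}, index=data.columns)

# Both alignment requirements hold for these frames.
print(row_var.index.equals(data.index))
print(col_var.index.equals(data.columns))
```

Frames shaped like these would then be passed as `add_var(row_var=row_var, col_var=col_var)`.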
interpolate_even(data, freq, method='linear')
Interpolates the given dataframe according to the frequency
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | pd.DataFrame | Dataframe containing the data | required |
freq | str | Interpolation frequency | required |
method | str | Interpolation method, by default "linear" | 'linear' |
Returns:
Type | Description |
---|---|
pd.DataFrame | Interpolated dataframe |
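One way to get this behavior with pandas is to resample onto an evenly spaced grid and interpolate the new rows. This is a sketch under that assumption, not necessarily how `interpolate_even` is implemented; the name `interpolate_even_sketch` is hypothetical.

```python
import pandas as pd


def interpolate_even_sketch(data: pd.DataFrame, freq: str, method: str = "linear") -> pd.DataFrame:
    """Resample to an even grid at `freq` and interpolate the gaps."""
    # resample() creates the evenly spaced index; interpolate() fills new rows.
    return data.resample(freq).interpolate(method=method)


# Hypothetical data sampled every two days.
data = pd.DataFrame(
    {"a": [1.0, 3.0], "b": [4.0, 8.0]},
    index=pd.to_datetime(["2021-11-13", "2021-11-15"]),
)

even = interpolate_even_sketch(data, "D")
print(even)
```

Note that with this approach, original timestamps that do not fall on the resampled grid are not kept.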
get_prepared_data(data, ip_frac=0.5)
Creates interpolated data and column ranks
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data |
pd.DataFrame
|
Dataframe containing the data |
required |
ip_frac |
float
|
Interpolation fraction, by default 0.5 |
0.5
|
Returns:
Type | Description |
---|---|
tuple[pd.DataFrame, pd.DataFrame] | |
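For the rank half of this step, column ranks can be sketched with `DataFrame.rank` along the column axis. The ranking direction (largest value first) and the tie-breaking method are assumptions of this example, and the data is made up.

```python
import pandas as pd

# Hypothetical data: time as the index, one column per bar.
data = pd.DataFrame(
    {"a": [1.0, 5.0], "b": [4.0, 3.0], "c": [2.0, 6.0]},
    index=pd.to_datetime(["2021-11-13", "2021-11-14"]),
)

# Rank the columns within each row; the largest value gets rank 1.
ranks = data.rank(axis=1, ascending=False, method="first")

print(ranks)
```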
get_top_cols()
Selects columns where column_rank < n_bars at any timestamp
Returns:
Type | Description |
---|---|
list[int] | List of columns that will appear in the animation at least once |
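The selection rule can be sketched as follows. This example assumes zero-based ranks, so that `column_rank < n_bars` keeps exactly the top `n_bars` columns per row, and it returns column labels for readability; both choices are assumptions rather than confirmed library behavior.

```python
import pandas as pd

n_bars = 2

# Hypothetical data: column "d" never reaches the top two.
data = pd.DataFrame(
    {"a": [1.0, 5.0], "b": [4.0, 3.0], "c": [2.0, 6.0], "d": [0.5, 0.1]},
    index=pd.to_datetime(["2021-11-13", "2021-11-14"]),
)

# Zero-based ranks (assumption): 0 is the largest value in each row.
ranks = data.rank(axis=1, ascending=False, method="first") - 1

# Keep the columns that rank inside the top n_bars at any timestamp.
top_cols = ranks.columns[(ranks < n_bars).any(axis=0)].tolist()

print(top_cols)
```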
get_bar_colors()
Generates bar (column) colors based on the given color palettes
Returns:
Type | Description |
---|---|
dict[str, str] | dict containing column to color mapping |
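A column to color mapping of this shape could be built from matplotlib colormaps, as in the sketch below. The function name `bar_colors_sketch`, the way columns are split across palettes, and the hex output format are all assumptions of this example.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex


def bar_colors_sketch(columns, palettes=("viridis",)):
    """Map each column to a hex color sampled from the given palettes."""
    colors = {}
    # Split the columns into one chunk per palette (an assumption of this sketch).
    chunks = np.array_split(np.array(list(columns), dtype=object), len(palettes))
    for palette, chunk in zip(palettes, chunks):
        cmap = plt.get_cmap(palette)
        # Sample evenly spaced points of the colormap, one per column.
        for col, pos in zip(chunk, np.linspace(0, 1, len(chunk))):
            colors[col] = to_hex(cmap(pos))
    return colors


colors = bar_colors_sketch(["a", "b", "c"], palettes=["viridis"])
print(colors)
```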