Runtime Changes
Notes
This update reduces runtime by 50% on average.
Profiler
- Add support for HistogramOptions
- Add multiprocessing support
- Reduced runtime for shuffling indices
- Vectorized precision function
- Improved unique set & vocab merging
- By default histogram only runs 'auto' bin edge detection
Data
- Add length attribute to the data class
data.length() or len(data)
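A minimal sketch of how both spellings can return the same value. The class name TabularData and its internals are hypothetical stand-ins for illustration, not the library's actual data class:

```python
class TabularData:
    """Hypothetical stand-in for a data class (not the library's implementation)."""

    def __init__(self, rows):
        self._rows = list(rows)

    def length(self):
        """Return the number of rows held by this object."""
        return len(self._rows)

    def __len__(self):
        # Delegating to length() makes the built-in len(data) work as well.
        return self.length()


data = TabularData([{"a": 1}, {"a": 2}, {"a": 3}])
n_via_method = data.length()
n_via_builtin = len(data)
```

Defining `__len__` in terms of `length()` keeps the two entry points from drifting apart.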
Report
- Added optional omit_keys to the report options function, remove keys from the final report
- Added row_has_null_count (global), one or more nulls in the row
- Added row_is_null_count (global), the entire row is null
- Rename total_samples (global) -> row_count
- Rename label BACKGROUND -> UNKNOWN (column)
- Removed covariance (global)
- Removed data_classification (global)
- Removed data_label_probability (column)
- Removed median (column)
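The difference between the two new row-level counts, and the effect of omitting keys, can be sketched in plain Python. The helper names row_null_counts and drop_keys are hypothetical, None is assumed to mark a null cell, and only top-level keys are handled here; the library's own report options may behave differently:

```python
def row_null_counts(rows):
    """Count rows with at least one null and rows that are entirely null.

    Each row is a sequence of cell values; None marks a null cell (assumption).
    """
    has_null = sum(1 for row in rows if any(v is None for v in row))
    is_null = sum(1 for row in rows if len(row) > 0 and all(v is None for v in row))
    return {
        "row_count": len(rows),
        "row_has_null_count": has_null,
        "row_is_null_count": is_null,
    }


def drop_keys(report, omit_keys):
    """Hypothetical helper: copy of report without the listed top-level keys."""
    return {k: v for k, v in report.items() if k not in omit_keys}


rows = [
    (1, "x"),       # no nulls
    (None, "y"),    # one null
    (None, None),   # entirely null
    (4, None),      # one null
]
stats = row_null_counts(rows)
trimmed = drop_keys(stats, omit_keys=["row_is_null_count"])
```

A row that is entirely null is counted by both statistics, so row_is_null_count can never exceed row_has_null_count.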
Bug fixes
- Accurate null count and total_samples on profile updates
- Each column now receives the same sampled indices, enabling row_is_null_count
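Why shared indices matter for row_is_null_count can be sketched as follows: if every column is sampled at the same row positions, the per-column null positions can be intersected to find rows that are null everywhere. The function names, the seed parameter, and the use of None for nulls are assumptions made for this illustration, not the profiler's internals:

```python
import random


def sample_indices(n_rows, sample_size, seed=0):
    """Shuffle row indices once and keep the first sample_size of them."""
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    return sorted(indices[:sample_size])


def count_fully_null_rows(columns, indices):
    """Count sampled rows that are null in every column.

    The intersection is only meaningful because all columns use the same indices.
    """
    null_sets = [
        {i for i in indices if values[i] is None} for values in columns.values()
    ]
    return len(set.intersection(*null_sets)) if null_sets else 0


columns = {
    "a": [1, None, None, 4, None],
    "b": ["x", None, "z", None, None],
}
shared = sample_indices(n_rows=5, sample_size=5)
fully_null = count_fully_null_rows(columns, shared)
```

If each column drew its own sample, a null at position i in one column could not be matched against the same row in another column, so a row-level count could not be derived.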