Profiler
- Structured Profiler can now take in duplicate columns #315
- This is an API change for accessing data in the report: `data_stats` is now a list
- Categorical Profile now includes top 5 counts #299
- Add new categorical statistics: gini impurity and unalikeability #308, #320
- Unstructured Data Labeler profile now includes entity percentages #305
- Add Pearson's correlation to the Structured Profiler #281, #307, #317
- Unstructured Profiler Text vocab now outputs the top k highest vocab counts #304, #314
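
For readers unfamiliar with the new categorical statistics, here is a minimal sketch using the textbook definitions of top-k counts, Gini impurity, and unalikeability. The function names (`top_k_counts`, `gini_impurity`, `unalikeability`) are hypothetical and written for illustration only; this is not the profiler's own code, and its implementation may differ in edge-case handling.

```python
from collections import Counter


def top_k_counts(values, k=5):
    """Return the k most frequent categories with their counts."""
    return Counter(values).most_common(k)


def gini_impurity(values):
    """Gini impurity: 1 - sum(p_i^2) over category proportions p_i."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return None  # undefined for an empty sample
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def unalikeability(values):
    """Fraction of ordered pairs of distinct observations whose categories differ."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return None  # undefined for an empty sample
    if total == 1:
        return 0.0  # a single observation has no pairs to compare
    # Each observation in a category of size c differs from (total - c) others.
    differing_pairs = sum(count * (total - count) for count in counts.values())
    return differing_pairs / (total * total - total)


sample = ["a", "a", "b", "c"]
top_categories = top_k_counts(sample, k=1)
impurity = gini_impurity(sample)
unalike = unalikeability(sample)
```

Both statistics are 0 when every value falls in one category and grow as the values spread across more categories.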
Runtime Changes
- Categorical Profiler keeps an internal count of categories #296
- Text in Unstructured Profiler now keeps a count of vocab #304
- Data Reader's `is_match` function can now take in StringIO/BytesIO #292, #306, #326
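
As a rough illustration of what accepting in-memory buffers involves, the sketch below shows a format check that works on either a file path or a `StringIO`/`BytesIO` object. The helpers `read_text` and `looks_like_csv` are hypothetical names invented for this example, and the consistency check is deliberately simplistic; it is not the Data Reader's `is_match` logic.

```python
from io import BytesIO, StringIO


def read_text(source, encoding="utf-8"):
    """Read all text from a path or in-memory buffer (hypothetical helper).

    For buffers, the read position is restored so the caller can reuse them.
    """
    if isinstance(source, (StringIO, BytesIO)):
        start = source.tell()
        content = source.read()
        source.seek(start)  # leave the buffer where we found it
        if isinstance(content, bytes):
            content = content.decode(encoding, errors="replace")
        return content
    with open(source, "r", encoding=encoding) as handle:
        return handle.read()


def looks_like_csv(source, delimiter=","):
    """Naive check: every non-empty line has the same, non-zero delimiter count."""
    lines = [line for line in read_text(source).splitlines() if line.strip()]
    if not lines:
        return False
    delimiter_counts = {line.count(delimiter) for line in lines}
    return len(delimiter_counts) == 1 and delimiter_counts.pop() > 0


text_buffer = StringIO("name,age\nalice,30\nbob,25\n")
bytes_buffer = BytesIO(b"name,age\nalice,30\nbob,25\n")
text_matches = looks_like_csv(text_buffer)
bytes_matches = looks_like_csv(bytes_buffer)
```

Restoring the buffer position matters because a format check usually runs before the actual load, and the loader expects to read from the start.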
Bug fixes
- Fix to ensure that samples stored by UnstructuredProfiler are saved #313