Merge List of Profiles¶
This example demonstrates a utility function in the Data Profiler library for merging profile objects, for instance profiles that were computed separately in a distributed setting. It assumes the user passes a list of profile objects to the utility function, which merges them all into a single profile.
Imports¶
Let’s start by importing the necessary packages…
[ ]:
import os
import sys
import json
import pandas as pd
import tensorflow as tf
try:
    # Make a local checkout one directory up importable first
    sys.path.insert(0, '..')
    import dataprofiler as dp
    from dataprofiler.profilers.utils import merge_profile_list
except ImportError:
    import dataprofiler as dp
    from dataprofiler.profilers.utils import merge_profile_list

# remove extra tf logging
tf.compat.v1.logging.set_verbosity(tf.compat.v1.logging.ERROR)
Set Up the Data and Profiler¶
This section shows a basic example of using the Data Profiler.
Instantiate a Pandas dataframe with dummy data
Pass the dataframe to the Profiler twice to instantiate two separate profilers, and store them in a list
[ ]:
d = {'col1': [1, 2], 'col2': [3, 4]}
df = pd.DataFrame(data=d)
list_of_profiles = [dp.Profiler(df), dp.Profiler(df)]
Take a look at the list of profiles…
[ ]:
list_of_profiles
Run Merge on List of Profiles¶
Now let’s merge the list of profiles into a single profile, single_profile:
[ ]:
single_profile = merge_profile_list(list_of_profiles=list_of_profiles)
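Conceptually, merging a list of profiles can be pictured as folding the list with a pairwise merge, where each merge combines summary statistics without revisiting the raw data. The sketch below illustrates that idea only. It is not the library's implementation of merge_profile_list, and ToyProfile and merge_toy_profiles are hypothetical names invented for this illustration, tracking just a count, sum, minimum, and maximum for one numeric column.

```python
from dataclasses import dataclass
from functools import reduce
from typing import List


@dataclass(frozen=True)
class ToyProfile:
    """Hypothetical stand-in for a profile: summary stats of one numeric column."""
    count: int
    total: float
    minimum: float
    maximum: float

    @classmethod
    def from_values(cls, values: List[float]) -> "ToyProfile":
        # Build the summary from a non-empty chunk of raw values.
        return cls(len(values), sum(values), min(values), max(values))

    def __add__(self, other: "ToyProfile") -> "ToyProfile":
        # Merging needs only the two summaries, never the raw data.
        return ToyProfile(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )

    @property
    def mean(self) -> float:
        return self.total / self.count


def merge_toy_profiles(profiles: List[ToyProfile]) -> ToyProfile:
    """Fold a non-empty list of profiles into one by repeated pairwise merging."""
    if not profiles:
        raise ValueError("expected at least one profile")
    return reduce(lambda left, right: left + right, profiles)


# Profile three chunks separately (as separate workers might), then merge.
chunks = [[1, 2], [3, 4], [5]]
merged = merge_toy_profiles([ToyProfile.from_values(c) for c in chunks])

# Profiling all of the data at once gives the same summary.
whole = ToyProfile.from_values([1, 2, 3, 4, 5])
```

In this toy setting, merged and whole compare equal, which is the property that makes profiling chunks independently and merging afterwards useful.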
And check out the .report() output for the single profile:
[ ]:
single_profile.report()