Avro Data
- class dataprofiler.data_readers.avro_data.AVROData(input_file_path=None, data=None, options=None)
Bases: dataprofiler.data_readers.json_data.JSONData, dataprofiler.data_readers.base_data.BaseData
AVROData class to save and load AVRO data.
Data class for loading datasets of type AVRO. The data can be supplied either in memory or via a file path. Options pertaining to the AVRO format may also be specified using the options dict parameter. Possible options:
options = dict(
    data_format= type: str, choices: "dataframe", "records", "avro"
    selected_keys= type: list(str)
)
data_format: user-selected format in which to return the data; it can only be one of the specified choices
selected_keys: keys being selected from the entire dataset
- Parameters
input_file_path (str) – path to the file being loaded or None
data (multiple types) – data being loaded into the class instead of an input file
options (dict) – options pertaining to the data type
- Returns
None
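The option rules above can be sketched as a small pre-flight check. This is only an illustration of the documented constraints, not the library's own validation: `validate_avro_options` and `ALLOWED_DATA_FORMATS` are hypothetical names, and the allowed values are copied from the choices listed above.

```python
# Hypothetical helper, not part of dataprofiler: it mirrors only the
# constraints stated in the class description above.
ALLOWED_DATA_FORMATS = ("dataframe", "records", "avro")


def validate_avro_options(options=None):
    """Return a list of problems found in an AVROData-style options dict."""
    options = options or {}
    errors = []

    # data_format must be one of the documented choices.
    if "data_format" in options:
        if options["data_format"] not in ALLOWED_DATA_FORMATS:
            errors.append(
                "data_format must be one of " + ", ".join(ALLOWED_DATA_FORMATS)
            )

    # selected_keys must be a list of strings.
    if "selected_keys" in options:
        keys = options["selected_keys"]
        if not isinstance(keys, list) or not all(isinstance(k, str) for k in keys):
            errors.append("selected_keys must be a list of str")

    return errors


good = validate_avro_options({"data_format": "records", "selected_keys": ["id"]})
bad = validate_avro_options({"data_format": "csv", "selected_keys": "id"})
print(good)
print(bad)
```

Running a check like this before constructing the data class gives an early, readable error list instead of a failure during loading; how the library itself reacts to invalid options is not stated here.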
- data_type = 'avro'
- classmethod is_match(file_path, options=None)
Test the given file to check if the file has valid AVRO format or not.
- Parameters
file_path (str) – path to the file to be examined
options (dict) – avro read options
- Returns
True if the file is an AVRO file, False otherwise
- Return type
bool
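The description does not say how is_match decides whether a file is valid AVRO. As a rough illustration of one cheap check, the sketch below tests for the four magic bytes (`Obj` followed by the byte 1) that open an Avro object container file. `has_avro_magic` is a hypothetical helper, and this is an assumption about a possible pre-check, not the library's implementation, which may well parse the whole file.

```python
import os
import tempfile

# An Avro object container file starts with the ASCII bytes "Obj" then 0x01.
AVRO_MAGIC = b"Obj\x01"


def has_avro_magic(file_path):
    """Return True if the file starts with the Avro container magic bytes."""
    try:
        with open(file_path, "rb") as f:
            return f.read(len(AVRO_MAGIC)) == AVRO_MAGIC
    except OSError:
        # Missing or unreadable files are treated as non-matching.
        return False


def _write_temp(content):
    """Write bytes to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(content)
    return path


# Two throwaway files: one with the magic prefix, one plain CSV text.
avro_like_path = _write_temp(AVRO_MAGIC + b"remaining header bytes")
csv_path = _write_temp(b"id,name\n1,a\n")

avro_like_result = has_avro_magic(avro_like_path)
csv_result = has_avro_magic(csv_path)
print(avro_like_result, csv_result)

os.remove(avro_like_path)
os.remove(csv_path)
```

A magic-byte test only rules files out quickly; a file that passes can still have a corrupt header or body, so a full read is needed to confirm validity.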
- property data
- property data_and_metadata
Returns a data frame that joins the data and the metadata.
- property data_format
- property file_encoding
- get_batch_generator(batch_size)
- info = None
- property is_structured
Determines compatibility with StructuredProfiler
- property length
Returns the length of the dataset which is loaded.
- Returns
length of the dataset
- property metadata
Returns a data frame that contains the metadata
- reload(input_file_path=None, data=None, options=None)
Reload the data class with a new dataset. This erases all existing data/options and replaces it with the input data/options.
- Parameters
input_file_path (str) – path to the file being loaded or None
data (multiple types) – data being loaded into the class instead of an input file
options (dict) – options pertaining to the data type
- Returns
None
- property selected_keys