JSON Data
-
class dataprofiler.data_readers.json_data.JSONData(input_file_path=None, data=None, options=None)
Bases: dataprofiler.data_readers.structured_mixins.SpreadSheetDataMixin, dataprofiler.data_readers.base_data.BaseData
SpreadsheetData class to save and load spreadsheet data
Data class for loading datasets of type JSON. The data can be supplied in memory or via a file path. Options pertaining to the JSON data may also be specified using the options dict parameter. Possible options:
options = dict(
    data_format= type: str, choices: "dataframe", "records", "json", "flattened_dataframe"
    selected_keys= type: list(str)
    payload_keys= type: Union[str, list(str)]
)
data_format: user-selected format in which to return the data; must be one of the listed choices
selected_keys: keys being selected from the entire dataset
payload_keys: list of dictionary keys that determine the payload
- Parameters
input_file_path (str) – path to the file being loaded or None
data (multiple types) – data being loaded into the class instead of an input file
options (dict) – options pertaining to the data type
- Returns
None
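As a rough illustration of the options described above, the sketch below assembles an options dict and passes it to the constructor. `build_json_options` and `load_json` are hypothetical helper names, not part of the library. The constructor signature, the `data` property, and the option names come from the text above; the up-front validation and the default `data_format` are assumptions made for this example only.

```python
# Choices for data_format as listed in the class description.
DATA_FORMATS = ("dataframe", "records", "json", "flattened_dataframe")


def build_json_options(data_format="dataframe", selected_keys=None, payload_keys=None):
    """Hypothetical helper: assemble the options dict documented above."""
    # Assumption: reject unknown formats early instead of relying on the reader.
    if data_format not in DATA_FORMATS:
        raise ValueError(
            f"data_format must be one of {DATA_FORMATS}, got {data_format!r}"
        )
    options = {"data_format": data_format}
    if selected_keys is not None:
        options["selected_keys"] = list(selected_keys)  # documented as list(str)
    if payload_keys is not None:
        options["payload_keys"] = payload_keys  # str or list(str)
    return options


def load_json(input_file_path, **option_kwargs):
    """Hypothetical helper: load a JSON file and return it in the chosen format."""
    # Imported lazily so build_json_options works without the library installed.
    from dataprofiler.data_readers.json_data import JSONData

    reader = JSONData(
        input_file_path=input_file_path,
        options=build_json_options(**option_kwargs),
    )
    return reader.data
```

A call such as `load_json("events.json", data_format="records")` would then return the loaded data in the requested format; the file name here is a placeholder.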
-
data_type = 'json'
-
property selected_keys
-
property metadata
Returns a data frame that contains the metadata.
-
property data_and_metadata
Returns a data frame that joins the data and the metadata.
-
property is_structured
Determines compatibility with StructuredProfiler.
-
classmethod is_match(file_path, options=None)
Tests the first 1000 lines of a given file to check whether the file has valid JSON format. At least 60 percent of those lines must be valid JSON.
- Parameters
file_path (str) – path to the file to be examined
options (dict) – json read options
- Returns
whether the file is a JSON file
- Return type
bool
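The line-based check described for `is_match` can be approximated with the standard library alone. The following is a minimal sketch, not the library's implementation: the function name `looks_like_json_lines` is hypothetical, each line is parsed independently (so a pretty-printed document spanning several lines would not pass), and skipping blank lines and the default encoding are assumptions.

```python
import json
from itertools import islice


def looks_like_json_lines(file_path, max_lines=1000, min_valid_ratio=0.6,
                          encoding="utf-8"):
    """Return True if enough of the first max_lines lines parse as JSON."""
    valid = 0
    total = 0
    with open(file_path, encoding=encoding) as handle:
        for line in islice(handle, max_lines):
            line = line.strip()
            if not line:
                continue  # assumption: blank lines are ignored, not counted
            total += 1
            try:
                # Any JSON value counts here, including bare numbers or strings.
                json.loads(line)
                valid += 1
            except ValueError:  # json.JSONDecodeError subclasses ValueError
                pass
    # An empty file is treated as not matching.
    return total > 0 and valid / total >= min_valid_ratio
```

The thresholds are exposed as parameters only to make the heuristic easy to experiment with; the documented behavior uses 1000 lines and 60 percent.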
-
property data
-
property data_format
-
property file_encoding
-
get_batch_generator(batch_size)
-
info = None
-
property length
Returns the length of the loaded dataset.
- Returns
length of the dataset
-
reload(input_file_path=None, data=None, options=None)
Reloads the data class with a new dataset. This erases all existing data and options and replaces them with the input data and options.
- Parameters
input_file_path (str) – path to the file being loaded or None
data (multiple types) – data being loaded into the class instead of an input file
options (dict) – options pertaining to the data type
- Returns
None
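Because `reload` erases the existing options along with the data, any options that should carry over have to be passed again. The sketch below shows one way to do that; `switch_dataset` is a hypothetical helper, and it relies only on the `reload` signature documented above.

```python
def switch_dataset(reader, new_file_path, options=None):
    """Hypothetical helper: point an existing reader at a different file."""
    # reload() discards the previous data and options, so the options that
    # should apply to the new file are supplied explicitly.
    reader.reload(input_file_path=new_file_path, options=options)
    return reader
```

For example, `switch_dataset(reader, "next.json", options={"data_format": "records"})` would reuse the same object for a second file; the file name is a placeholder.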