Template Customization Guide¶
DataComPy allows you to customize the report output by providing your own Jinja2 templates. This guide explains how to create and use custom templates for comparison reports.
Template Basics¶
Custom templates are Jinja2 templates that receive comparison data and format it according to your needs. The template receives a context dictionary with various comparison metrics and data samples.
Available Template Variables¶
The following variables are available in the template context:
Variable
Description
column_summary
Dict with column statistics including:
common_columns
: List of columns common to both dataframes
df1_unique
: List of columns unique to df1
df2_unique
: List of columns unique to df2
df1_name
: Name of the first dataframe
df2_name
: Name of the second dataframe
row_summary
Dict with row statistics including:
match_columns
: List of columns used for matching
abs_tol
: Absolute tolerance between two values
rel_tol
: Relative tolerance between two values
common_rows
: Number of rows in common
df1_unique
: Number of rows unique to df1
df2_unique
: Number of rows unique to df2
unequal_rows
: Number of rows with differences
df1_name
: Name of the first dataframe
df2_name
: Name of the second dataframe
column_comparison
Dict with column comparison stats:
unequal_columns
: List of columns with mismatches
equal_columns
: List of columns that match exactly
unequal_values
: Count of unequal values across all columns
mismatch_stats
Dict containing:
stats
: List of dicts with per-column mismatch statistics: -column
: Column name -match
: Number of matching values -mismatch
: Number of mismatched values -null_diff
: Number of null value differences -total
: Total number of comparisons
samples
: Sample rows with mismatched values
has_samples
: Boolean indicating if there are any samples
has_mismatches
: Boolean indicating if there are any mismatches
df1_unique_rows
Dict with unique rows in df1:
has_rows
: Boolean indicating if there are unique rows
rows
: Sample of unique rows (as strings)
columns
: List of column names
df2_unique_rows
Dict with unique rows in df2:
has_rows
: Boolean indicating if there are unique rows
rows
: Sample of unique rows (as strings)
columns
: List of column names
df1_shape
,df2_shape
Tuples of (rows, columns) for each dataframe
df1_name
,df2_name
Names of the dataframes being compared
Creating a Custom Template¶
To create a custom template:
Create a new file with a
.j2
extensionUse Jinja2 syntax to format the output
Reference the available template variables as needed
Example Template¶
Here’s a simple example template that shows basic comparison metrics:
# Data Comparison Report
=====================
## DataFrames
- {{ df1_name }}: {{ df1_shape[0] }} rows × {{ df1_shape[1] }} columns
- {{ df2_name }}: {{ df2_shape[0] }} rows × {{ df2_shape[1] }} columns
## Column Summary
- Common columns: {{ column_summary.common_columns }}
- Columns only in {{ df1_name }}: {{ column_summary.df1_unique }}
- Columns only in {{ df2_name }}: {{ column_summary.df2_unique }}
## Row Summary
- Rows with some columns unequal: {{ row_summary.unequal_rows }}
- Rows with all columns equal: {{ row_summary.equal_rows }}
{% if mismatch_stats %}
## Mismatched Columns
{% for col in mismatch_stats %}
- {{ col.column }}: {{ col.unequal_cnt }} mismatches
- Match rate: {{ "%.2f"|format(col.match_rate * 100) }}%
{% endfor %}
{% endif %}
Using a Custom Template¶
To use your custom template, pass its path to the report()
method:
from datacompy import Compare
compare = Compare(df1, df2, join_columns=['id'])
# Generate report with custom template
report = compare.report(template_path='path/to/your/template.j2')
print(report)
Template Path Resolution¶
The template path can be:
An absolute path to a template file
A path relative to the current working directory
A filename in the default templates directory (
datacompy/templates/
)
Jinja2 Template Features¶
You can use all standard Jinja2 features in your templates, including:
Control structures (
{% if %}
,{% for %}
, etc.)Filters (
{{ value|upper }}
,{{ value|default('N/A') }}
, etc.)Macros for reusable components
Template inheritance
For more information on Jinja2 templating, see the Jinja2 documentation.
Default Template Reference¶
The default template used by DataComPy is available in the source code at datacompy/templates/report_template.j2
.
You can use this as a reference when creating your own templates.