Display visual blocks of text with data rendered inline.
To use rows2prose, you give it two things:
- a dataframe containing values to visualize
- styled HTML text
rows2prose renders inline data visualizations into the text.
Install
pip install rows2prose
Get started
Here are some toy examples. Each one writes a standalone HTML file. If you’re running in a notebook, use rows2prose.notebook instead.
1. Visualize a dataset's features
import numpy as np
import pandas as pd
import rows2prose as r2p
import sklearn.datasets
df = sklearn.datasets.load_wine(as_frame=True).frame
viz = r2p.DistributionListSnapshot
html = "<strong>Properties of 3 different classes of wine</strong><br/>"
controls = []
for i, name in enumerate(df.columns):
    if name != "target":
        html += f"""<div style='display:inline-block;margin:10px;'>
          {name.replace('_', ' ')}:<br/>
          <span data-key='{name}' class='scalar-view{i}'></span>
        </div>"""
        # Use a different scalar view control for each visualization if you
        # want different scales for each.
        controls.append(
            viz.scalar_view(class_name=f"scalar-view{i}", height=20)
        )
output1 = r2p.static(df, html, viz(*controls, i_config_column="target"))
with open("out1.html", "w") as f:
    f.write(r2p.full_html(output1))
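The template-building loop above follows a pattern that could be pulled into a small helper. The sketch below is not part of rows2prose: `build_template` and its parameters are hypothetical names, and it assumes only what the example shows, namely that each placeholder is a span whose data-key names a dataframe column and whose class is unique per column. It uses only the standard library.

```python
from html import escape


def build_template(columns, title, skip=("target",)):
    """Return (html, class_names) for the given column names.

    Hypothetical helper, not a rows2prose API. It mirrors the loop in the
    example above: one inline block per column, each holding a placeholder
    span whose data-key is the column name.
    """
    html = f"<strong>{escape(title)}</strong><br/>"
    class_names = {}
    for i, name in enumerate(columns):
        if name in skip:
            continue
        class_name = f"scalar-view{i}"
        label = escape(name.replace("_", " "))
        key = escape(name, quote=True)  # safe inside a quoted attribute
        html += (
            "<div style='display:inline-block;margin:10px;'>"
            f"{label}:<br/>"
            f"<span data-key='{key}' class='{class_name}'></span>"
            "</div>"
        )
        class_names[name] = class_name
    return html, class_names


template, classes = build_template(
    ["alcohol", "malic_acid", "target"],
    "Properties of 3 different classes of wine",
)
print(classes)
```

Each returned class name could then be passed as class_name to viz.scalar_view, as in the example above.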
2. Browse a time series
df = sklearn.datasets.load_linnerud(as_frame=True).frame
viz = r2p.Timeline
html = """<p><strong>Browse sklearn's toy exercise dataset:</strong></p>
<div class="time-control" style="width:340px"></div>"""
controls = [viz.time_control(class_name="time-control", prefix="Athlete")]
for i, name in enumerate(df.columns):
    html += f"""<p>
      {name}:
      <span data-key='{name}' class='scalar-view{i}'></span>
    </p>"""
    controls.append(viz.positive_scalar_view(class_name=f"scalar-view{i}"))
df["id"] = np.arange(df.shape[0])
output2 = r2p.static(df, html, viz(*controls, i_timestep_column="id"))
with open("out2.html", "w") as f:
    f.write(r2p.full_html(output2))
Actual use
rows2prose’s visualizations are designed for cases where you are taking snapshots of a multidimensional system (e.g. a machine learning model) and comparing them. You might be comparing them across time, across multiple configurations, or across multiple samples.
For example, see this blog post.
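One way to organize the snapshots described above is as a long table with one row per snapshot: a column identifying the timestep or configuration, plus one column per measured value. The sketch below uses only the standard library and made-up numbers. `take_snapshot`, `loss`, and `weight_norm` are hypothetical names chosen for illustration, and the formulas are arbitrary. The `step` column plays the role that `id` plays in the time series example.

```python
import math


def take_snapshot(step, config):
    """Return one row describing a toy system at a given step.

    Hypothetical stand-in for inspecting a real model. The formulas are
    arbitrary and exist only to produce values that change over time.
    """
    decay = math.exp(-config["lr"] * step)
    return {
        "step": step,
        "lr": config["lr"],
        "loss": decay,
        "weight_norm": 1.0 - decay,
    }


# Two configurations, three timesteps each: six snapshot rows in total.
configs = [{"lr": 0.1}, {"lr": 0.5}]
rows = [take_snapshot(step, cfg) for cfg in configs for step in range(3)]

for row in rows:
    print(row)
```

Passing `rows` to pandas.DataFrame gives a dataframe with one column per key. Which visualization fits depends on the comparison: in the toy examples above, Timeline takes a timestep column and DistributionListSnapshot takes a configuration column.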