PrettyPandas¶
PrettyPandas is an extension to the Pandas DataFrame class that helps you create report qualitiy tables with a simple API.
(
df
.pipe(PrettyPandas)
.as_currency('GBP', subset='A')
.as_percent(subset='B')
.total()
.average()
)
Features¶
- Add summary rows and columns.
- Number formatting for currency, scientific units, and percentages.
- Chaining commands.
- Works seamlessly with Pandas Style API.
Note
Version 0.0.4 removes the apply_pretty_globals
function and other custom
CSS properties because Pandas and Jupyter now defaults to providing great
looking html tables. If you still want custom CSS you can use the Pandas
Style API.
Installation¶
You can install PrettyPandas using pip
with support for Python 2.7, 3.3,
3.4, and 3.5:
pip install prettypandas
You can also install from source:
git clone git@github.com:HHammond/PrettyPandas.git
cd PrettyPandas
python setup.py install
Contributing¶
The project is available on GitHub and anyone is welcome to contribute. You can use the issue tracker to report issues, bugs, or suggest improvements.
Contents¶
Quick Start¶
Adding Summaries¶
PrettyPandas supports many built in summary functions, as well as providing the ability to create your own summaries. Summary functions can be applied over a DataFrame’s rows or columns, or both.
The builtin summary methods are:
If you wanted to add a grand total to the bottom of your table the code is simple:
PrettyPandas(df).total()
Or additionally if you want to use Pandas fluent API:
df.pipe(PrettyPandas).total()
PrettyPandas follows a fluent API so you can chain multiple summaries easily:
df.pipe(PrettyPandas).total().average()
The axis
parameter specifies which numpy
style axis to apply a summary
on — 0 for columns, 1 for rows, and None
for both.
PrettyPandas(df).total(axis=1)
You can even mix and match summaries applied to different axis.
Converting Back to Pandas DataFrame¶
.to_frame()
¶
After adding summary rows or columns you can get a DataFrame with your changes
applied by calling the ._to_frame
.
For example the following code would add a total to your DataFrame and return it back to a Pandas native DataFrame.
(
df
.pipe(PrettyPandas)
.total(axis=1)
.to_frame()
)
.style
¶
The .style
property allows you to drop right into the Pandas Style API.
This code would allow you to compute a summary, format the table using
percentages, and apply a backgrouned gradient to a table:
(
df.pipe(PrettyPandas)
.as_percent(precision=0)
.median()
.style
.background_gradient()
)
Formatting Numbers¶
Most reports use at least some units of measurement. PrettyPandas currently supports percentages, money, and a more general unit method.
The as_unit
method takes a positional unit
argument which indicates the
string representing the unit to be used and a location
argument to specify
whether the unit should be a prefix or suffix to the value.
The as_currency
and as_percent
methods are localized to use whatever
units your Python distribution thinks are best for you. If you aren’t getting
the correct units use the set_locale
method to specify your locale.
If you need to use a different currency, just pass it to currency='...'
to
change it.
The as_money
method takes optional currency
and location
arguments
which work just like the as_unit
method. By default the currency is in
dollars.
Note
Python 2 doesn’t support unicode literals by default. You can use unicode
literals (e.g. u'€'
) or import the unicode literal behaviour from
Python 3:
from __future__ import unicode_literals
Formatting Columns¶
By default the formatting methods apply to the entire dataframe. When you need to format just a few columns you can use the subset argument to specify a single column, or multiple columns.
PrettyPandas(df).as_percent(subset='A') # Format just column A
PrettyPandas(df).as_percent(subset=['A', 'B']) # Format columns A and B
Formatting Rows and Complex Formatting¶
Formatting rows is more complicated than formatting columns. The subset argument needs to take in a pandas.Index to specify the row.
# Format the row with row-index 3
PrettyPandas(df).as_percent(subset=pd.IndexSlice[3,:], precision=2)
For multi-index dataframes subsetting is more complicated. You will need to use
multiple pandas.IndexSlice
objects to get the correct rows.
The following example shows how to select rows in a multi-index:
first_row_idx = pd.IndexSlice[0, :]
second_row_idx = pd.IndexSlice[1, :]
(
df.pipe(PrettyPandas)
.as_currency(subset=first_row_idx)
.as_percent(subset=second_row_idx)
.total(axis=1)
)
For more info on Pandas indexing, read Pandas Indexing and Pandas Advanced Indexing.
Testing¶
Tests use pytest for testing. After downloading the repository from GitHub run the following:
py.test test
prettypandas package¶
-
class
prettypandas.
PrettyPandas
(data, summary_rows=None, summary_cols=None, formatters=None, *args, **kwargs)¶ Bases:
object
Parameters: - data – DataFrame.
- summary_rows – list of Aggregate objects to be appended as a summary.
- summary_cols – list of Aggregate objects to be appended as a summary.
- formatters – List of Formatter objects to format.
-
as_currency
(currency=u'USD', locale=Locale('en_US'), *args, **kwargs)¶ Format subset as currency
Parameters: - currency – Currency
- locale – Babel locale for currency formatting
- subset – Pandas subset
-
as_percent
(precision=2, *args, **kwargs)¶ Format subset as percentages
Parameters: - precision – Decimal precision
- subset – Pandas subset
-
as_unit
(unit, location=u'suffix', *args, **kwargs)¶ Format subset as with units
Parameters: - unit – string to use as unit
- location – prefix or suffix
- subset – Pandas subset
-
average
(title=u'Average', **kwargs)¶ Add a mean summary to this table.
Parameters: title – Title to be displayed.
-
frame
¶ Add summaries and convert back to DataFrame
-
max
(title=u'Maximum', **kwargs)¶ Add a maximum summary to this table.
Parameters: title – Title to be displayed.
-
median
(title=u'Median', **kwargs)¶ Add a median summary to this table.
Parameters: title – Title to be displayed.
-
min
(title=u'Minimum', **kwargs)¶ Add a minimum summary to this table.
Parameters: title – Title to be displayed.
-
multi_summary
(funcs, titles, axis=0, *args, **kwargs)¶
-
render
()¶
-
style
¶ Add summaries and convert to Pandas Styler
-
summary
(func=<operator.methodcaller object>, title=u'Total', axis=0, subset=None, *args, **kwargs)¶ Add multiple summary rows or columns to the dataframe.
Parameters: - func – function to be used for a summary.
- titles – Title for this summary column.
- axis – Same as numpy and pandas axis argument. A value of None will cause the summary to be applied to both rows and columns.
- args – Positional arguments passed to all the functions.
- kwargs – Keyword arguments passed to all the functions.
The results of summary can be chained together.
-
to_frame
()¶ Add summaries and convert back to DataFrame
-
total
(title=u'Total', **kwargs)¶ Add a total summary to this table.
Parameters: title – Title to be displayed.
-
prettypandas.
as_currency
(currency='USD', locale=Locale('en_US'))¶
-
prettypandas.
as_percent
(precision=2, **kwargs)¶ Convert number to percentage string.
Parameters: - v – numerical value to be converted
- precision – int decimal places to round to
-
prettypandas.
as_unit
(unit, precision=2, location='suffix')¶ Convert value to unit.
Parameters: - v – numerical value
- unit – string of unit
- precision – int decimal places to round to
- location – ‘prefix’ or ‘suffix’ representing where the currency symbol falls relative to the value
Submodules¶
prettypandas.summarize module¶
-
class
prettypandas.summarizer.
Aggregate
(title, func, subset=None, axis=0, *args, **kwargs)¶ Bases:
object
Aggreagte
Wrapper to calculate aggregate row on datafame.
Parameters: - title – Aggregate row title
- func – Function to be passed to DataFrame.agg
- subset – Subset of DataFrame to compute aggregate on
- axis – Pandas axis to compute over
- args – Positionsal arguments to DataFrame.agg
- kwargs – Keyword arguments to DataFrame.agg
-
apply
(df)¶ Compute aggregate over DataFrame
-
class
prettypandas.summarizer.
Formatter
(formatter, args, kwargs)¶ Bases:
object
Wrapper to apply formatting to datafame.
Parameters: - formatter – Function to be passed to Pandas Styler.format
- args – Positionsal arguments to Styler.format
- kwargs – Keyword arguments to Styler.format
-
apply
(styler)¶ Apply Summary over Pandas Styler
-
class
prettypandas.summarizer.
PrettyPandas
(data, summary_rows=None, summary_cols=None, formatters=None, *args, **kwargs)¶ Bases:
object
Parameters: - data – DataFrame.
- summary_rows – list of Aggregate objects to be appended as a summary.
- summary_cols – list of Aggregate objects to be appended as a summary.
- formatters – List of Formatter objects to format.
-
as_currency
(currency=u'USD', locale=Locale('en_US'), *args, **kwargs)¶ Format subset as currency
Parameters: - currency – Currency
- locale – Babel locale for currency formatting
- subset – Pandas subset
-
as_percent
(precision=2, *args, **kwargs)¶ Format subset as percentages
Parameters: - precision – Decimal precision
- subset – Pandas subset
-
as_unit
(unit, location=u'suffix', *args, **kwargs)¶ Format subset as with units
Parameters: - unit – string to use as unit
- location – prefix or suffix
- subset – Pandas subset
-
average
(title=u'Average', **kwargs)¶ Add a mean summary to this table.
Parameters: title – Title to be displayed.
-
frame
¶ Add summaries and convert back to DataFrame
-
max
(title=u'Maximum', **kwargs)¶ Add a maximum summary to this table.
Parameters: title – Title to be displayed.
-
median
(title=u'Median', **kwargs)¶ Add a median summary to this table.
Parameters: title – Title to be displayed.
-
min
(title=u'Minimum', **kwargs)¶ Add a minimum summary to this table.
Parameters: title – Title to be displayed.
-
multi_summary
(funcs, titles, axis=0, *args, **kwargs)¶
-
render
()¶
-
style
¶ Add summaries and convert to Pandas Styler
-
summary
(func=<operator.methodcaller object>, title=u'Total', axis=0, subset=None, *args, **kwargs)¶ Add multiple summary rows or columns to the dataframe.
Parameters: - func – function to be used for a summary.
- titles – Title for this summary column.
- axis – Same as numpy and pandas axis argument. A value of None will cause the summary to be applied to both rows and columns.
- args – Positional arguments passed to all the functions.
- kwargs – Keyword arguments passed to all the functions.
The results of summary can be chained together.
-
to_frame
()¶ Add summaries and convert back to DataFrame
-
total
(title=u'Total', **kwargs)¶ Add a total summary to this table.
Parameters: title – Title to be displayed.
prettypandas.formatters module¶
-
prettypandas.formatters.
as_currency
(currency='USD', locale=Locale('en_US'))¶
-
prettypandas.formatters.
as_percent
(precision=2, **kwargs)¶ Convert number to percentage string.
Parameters: - v – numerical value to be converted
- precision – int decimal places to round to
-
prettypandas.formatters.
as_unit
(unit, precision=2, location='suffix')¶ Convert value to unit.
Parameters: - v – numerical value
- unit – string of unit
- precision – int decimal places to round to
- location – ‘prefix’ or ‘suffix’ representing where the currency symbol falls relative to the value