pandera.core.pandas.container.DataFrameSchema.reset_index

DataFrameSchema.reset_index(level=None, drop=False)

Reset the index of a DataFrameSchema, either fully or partially.

Parameters

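level (Optional[List[str]]) – list of index level names to reset. If None (the default), the entire index is reset.
drop (bool) – if True, drop the reset index level(s) from the schema instead of adding them as columns. Default False.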

Return type

DataFrameSchema

Returns

a new DataFrameSchema with the specified index level(s) reset.

Raises

SchemaInitError – if no index is set in the schema.

Examples

Similar to the pandas reset_index method on a pandas DataFrame, this method can be used to fully or partially reset the index of a schema.

To remove the entire index from the schema, just call the reset_index method with default parameters.

>>> import pandera as pa
>>>
>>> example_schema = pa.DataFrameSchema(
...     {"probability": pa.Column(float)},
...     index=pa.Index(name="unique_id", dtype=int)
... )
>>>
>>> print(example_schema.reset_index())
<Schema DataFrameSchema(
    columns={
        'probability': <Schema Column(name=probability, type=DataType(float64))>
        'unique_id': <Schema Column(name=unique_id, type=DataType(int64))>
    },
    checks=[],
    coerce=False,
    dtype=None,
    index=None,
    strict=False
    name=None,
    ordered=False,
    unique_column_names=False
)>

This reclassifies an index (or indices) as a column (or columns).

Similarly, to reset only part of a MultiIndex, pass the names of the index levels you would like to remove from the index to the level parameter. The drop parameter controls whether those levels are dropped from the schema or kept as columns.

>>> example_schema = pa.DataFrameSchema(
...     {"category": pa.Column(str)},
...     index=pa.MultiIndex([
...         pa.Index(name="unique_id1", dtype=int),
...         pa.Index(name="unique_id2", dtype=str),
...     ])
... )
>>> print(example_schema.reset_index(level=["unique_id1"]))
<Schema DataFrameSchema(
    columns={
        'category': <Schema Column(name=category, type=DataType(str))>
        'unique_id1': <Schema Column(name=unique_id1, type=DataType(int64))>
    },
    checks=[],
    coerce=False,
    dtype=None,
    index=<Schema Index(name=unique_id2, type=DataType(str))>,
    strict=False
    name=None,
    ordered=False,
    unique_column_names=False
)>