Rogers and Berriman Editing and Imputation System (RBEIS)

RBEIS is a method originally developed for imputing categorical data in relatively small social surveys with the intention of minimising conditional imputation variance. It is derived from CANCEIS, which is better suited to large datasets such as the Census. This implementation of RBEIS works with Pandas DataFrames.

RBEIS consists of a package rbeis, containing classes used by all implementations of RBEIS, and subpackages rbeis.*, containing implementations of the impute method using various backends. Currently, only rbeis.pandas has a complete implementation, although future support is planned for PySpark in the rbeis.spark package.

To run an imputation, you will need to import impute and RBEISDistanceFunction by calling from rbeis.pandas import impute (using the Pandas backend) and from rbeis import RBEISDistanceFunction.

Prerequisites

RBEIS was developed in an environment requiring support for Python 3.6.8, pandas 0.20.3, numpy 1.13.1 and wheel 0.29.0. It may work with newer versions of these packages, but this is untested.

Installation

The latest RBEIS wheel is available via GitHub. Download the .whl file and call pip install /path/to/wheel.whl to install it.

Note

RBEIS does not yet have ONS approval to be published through PyPI. When this is given, it will be able to be installed easily using pip.

RBEIS is licensed under the MIT License.