Contribution Guide

We welcome contributions. To contribute, please follow these guidelines:

Pull Requests

Open all pull requests against the development branch. Please use the following naming conventions for your branches:
- Feature Branches: feature-<feature_name> for introducing new features.
- Bug Fixes: fix-<bug_description> for resolving bugs.
- Hotfixes: hotfix-<issue> for urgent fixes that go straight to production.
- Improvements/Refactors: refactor-<description> or improvement-<description> for code improvements.
- Documentation: docs-<change_description> for updates to documentation.
- Experimental: experiment-<experiment_name> for trial and exploratory work.
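
The naming conventions above can be checked mechanically. The following is a minimal sketch, not part of the project's tooling: `is_valid_branch_name` is a hypothetical helper, and the set of characters allowed after the prefix is an assumption, since the conventions only fix the prefixes.

```python
import re

# Prefixes taken from the naming conventions listed above.
ALLOWED_PREFIXES = (
    "feature",
    "fix",
    "hotfix",
    "refactor",
    "improvement",
    "docs",
    "experiment",
)

# Assumed rule: "<prefix>-<rest>", where <rest> is non-empty and made of
# letters, digits, dots, underscores, or hyphens. Only the prefixes come
# from the guide; the allowed characters are an illustrative choice.
BRANCH_PATTERN = re.compile(
    r"^(?:" + "|".join(ALLOWED_PREFIXES) + r")-[A-Za-z0-9._-]+$"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if ``name`` follows the ``<prefix>-<description>`` form."""
    return BRANCH_PATTERN.match(name) is not None


if __name__ == "__main__":
    for candidate in ("feature-add-logging", "bugfix-typo", "docs-"):
        print(candidate, is_valid_branch_name(candidate))
```

A check like this could be run locally before pushing; whether the project enforces branch names automatically is not stated here.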

Before you submit a pull request, please also follow these guidelines:
- Make sure that all unit tests, including the existing ones, pass with your changes.
- Include unit tests with your code changes whenever possible, preferably written in pytest format.
- Ensure that your code complies with the project's coding style guidelines, which include:
  - Writing docstrings in NumPy/SciPy style.
  - Using type hints in Python functions.
  - Adhering to the PEP 8 style guide for Python code.
  - Avoiding wildcard imports.
  - Importing PySpark functions as F (from pyspark.sql import functions as F).
  - Providing well-documented, easy-to-understand code, with clear variable and function names and explanatory comments where necessary.
- If you are making a significant change to the codebase, update the documentation to reflect the changes.
- If you are adding new functionality, provide examples of how to use it in the project's documentation or in a separate README file.
- If you are fixing a bug, include a description of the bug and how your changes address it.
- If you are adding a new dependency, include a brief explanation of why it is necessary and what it does.
- If you are making significant changes to the project's architecture or design, discuss your ideas with the project maintainers first to ensure they align with the project's goals and vision.
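
As an illustration of the style guidelines above, here is a small sketch of a function with type hints and a NumPy-style docstring. The function `clip_value` is hypothetical and is not taken from the rdsa-utils codebase; it only shows the expected shape.

```python
from typing import Optional


def clip_value(
    value: float,
    lower: Optional[float] = None,
    upper: Optional[float] = None,
) -> float:
    """Clip a value to an optional lower and upper bound.

    Parameters
    ----------
    value : float
        The value to clip.
    lower : float, optional
        Smallest value to return. No lower bound is applied if None.
    upper : float, optional
        Largest value to return. No upper bound is applied if None.

    Returns
    -------
    float
        The clipped value.

    Raises
    ------
    ValueError
        If both bounds are given and ``lower`` is greater than ``upper``.

    Examples
    --------
    >>> clip_value(5.0, lower=0.0, upper=3.0)
    3.0
    """
    if lower is not None and upper is not None and lower > upper:
        raise ValueError("lower must not be greater than upper")
    if lower is not None and value < lower:
        return lower
    if upper is not None and value > upper:
        return upper
    return value
```

The docstring sections (Parameters, Returns, Raises, Examples) follow the numpydoc layout; which sections the maintainers require for each function is for them to confirm.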

Issues

If you find a bug or would like to request a feature, please open an issue on the project's GitHub page. When opening an issue, please provide as much detail as possible, including:

- A clear and descriptive title.
- A description of the problem you're experiencing, including steps to reproduce it.
- Any error messages or logs related to the issue.
- Your operating system and Python version (if relevant).
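
To collect the operating system and Python version for an issue report, something like the following sketch may help. `environment_summary` is a hypothetical helper written for this guide; it only uses the standard library's `platform` module.

```python
import platform


def environment_summary() -> str:
    """Return the OS and Python version as text suitable for an issue."""
    return (
        f"OS: {platform.platform()}\n"
        f"Python: {platform.python_version()}"
    )


if __name__ == "__main__":
    # Paste the printed lines into the issue description.
    print(environment_summary())
```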

Please search through the existing issues before opening a new one to avoid duplicates. If you find an existing issue that covers your problem, please add any additional information as a comment. Issues will be triaged and prioritized by the project maintainers.

If you would like to contribute to the project by fixing an existing issue, please leave a comment on the issue to let the maintainers know that you are working on it.

Getting Started

Installing Python

Before getting started, you need Python 3.8 or higher installed on your system (the development setup described below supports Python 3.8 to 3.13). You can download Python from the official website. Make sure to add Python to your PATH during the installation process.

Alternatively, you can use Anaconda to create a virtual environment with Python 3.8 or higher. Anaconda is a popular Python distribution that comes with many pre-installed scientific computing packages and tools. Here's how to create a new environment with Anaconda:

Download and install Anaconda from the official website.
Open the Anaconda prompt.
Create a new virtual environment with Python 3.8 or higher:

conda create --name myenv python=3.8

Activate the virtual environment:

conda activate myenv
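
Whichever route you take, you can confirm that the active interpreter falls in the supported range. The sketch below uses the 3.8 to 3.13 range given in the development setup section of this guide; `is_supported` is a hypothetical helper, not part of the project.

```python
import sys
from typing import Tuple

# Supported range as stated in this guide (3.8 up to and including 3.13).
MIN_VERSION = (3, 8)
MAX_VERSION = (3, 13)


def is_supported(version: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) version is within the range."""
    return MIN_VERSION <= version <= MAX_VERSION


if __name__ == "__main__":
    current = sys.version_info[:2]
    status = "supported" if is_supported(current) else "not supported"
    print(f"Python {current[0]}.{current[1]} is {status}")
```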

Clone the Repository

Clone the repository to your local machine:

git clone https://github.com/ONSdigital/rdsa-utils.git
cd rdsa-utils

Set Up the Development Environment

Dependencies are managed through a traditional setup.py. To set up your development environment, first ensure that you have Python 3.8 to 3.13 installed.

Then, to install the package in editable mode along with all development dependencies, run the following command:

pip3 install -e ".[dev]"

The -e (or --editable) option installs the package in editable mode, so changes you make to the source code take effect without reinstalling the package. This is particularly useful for development.
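
One way to check that the editable install took effect is to ask Python where it would import the package from; for an editable install this should usually point into your clone. The sketch below assumes the import name is `rdsa_utils` (the repository name with an underscore), which this guide does not state, and `module_location` is a hypothetical helper.

```python
import importlib.util
from typing import Optional


def module_location(module_name: str) -> Optional[str]:
    """Return the path a top-level module would be imported from.

    Returns None if the module cannot be found.
    """
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # "rdsa_utils" is an assumed import name; adjust it if the package
    # is imported under a different name.
    print(module_location("rdsa_utils"))
```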

Running Tests

To run tests, ensure you're in the top-level directory of the project and execute:

pytest

This runs all the tests using the configuration defined in the project.
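
For reference, a test in pytest format is a plain function whose name starts with `test_` and that uses bare `assert` statements. The sketch below is illustrative only: `normalise_whitespace` is a hypothetical function defined inline so the example is self-contained, and in a real project the tests would normally live in a separate file that pytest discovers (by default, files named `test_*.py` or `*_test.py`, unless the project configuration changes this).

```python
def normalise_whitespace(text: str) -> str:
    """Collapse runs of whitespace into single spaces and strip the ends."""
    return " ".join(text.split())


def test_normalise_whitespace_collapses_runs():
    # Tabs, newlines, and repeated spaces all become one space.
    assert normalise_whitespace("a   b\t\nc") == "a b c"


def test_normalise_whitespace_strips_ends():
    # Leading and trailing whitespace is removed.
    assert normalise_whitespace("  padded  ") == "padded"
```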

Installing Pre-commit Hooks in Your Development Environment

Pre-commit hooks are used to automate checks and formatting before commits. Follow these steps to set them up:

Installation Steps

Install pre-commit: If you haven't already, install the pre-commit package:

pip install pre-commit

Install pre-commit hooks: Install the hooks defined in .pre-commit-config.yaml:

pre-commit install

This sets up the hooks to run automatically before each commit.

Usage

The pre-commit hooks run automatically on the files you have staged whenever you commit. To manually run all hooks on all files, use:

pre-commit run --all-files

This is useful for checking the whole codebase, for example right after installing the hooks.

Once you have followed these steps, your development environment for rdsa-utils is ready and you can start contributing to the project.