Automated module testing

To keep modules working over time, we use GitHub Actions (GHAs) to run each module periodically. The goal of these "module testing GHAs" is to check that the module code runs to completion without errors.

Info

For more information about how we run modules to generate workflow results associated with OpenScPCA data releases, please see our documentation on the OpenScPCA-nf workflow.

Module testing GHAs are automatically run in two circumstances:

- When a pull request is filed with changes to any module files
    - This GHA will need to pass without errors for pull requests to be approved
- On a periodic schedule
    - This ensures that changes in data or other code do not break tests within each module

For examples of existing module testing GHAs, see run_hello-python.yml and run_hello-R.yml, which test the example Python and R modules, respectively.

To make GHAs run efficiently, the tests should run the module code with the simulated test data. This means your module code needs to be flexible enough to run on test data as well as real data. You should read in files from the data/current directory, which automatically points to the test data during module testing GHA runs.
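As a minimal sketch of this idea, a Python module might locate its inputs by searching beneath data/current rather than hardcoding a path to a specific data release. The function name `list_input_files`, the `repo_root` argument, and the `*.rds` pattern used in the usage note are hypothetical and chosen only for illustration; adapt them to your module's actual inputs.

```python
from pathlib import Path

# Assumption: inputs are located relative to the repository root via data/current.
DEFAULT_DATA_DIR = Path("data") / "current"


def list_input_files(repo_root, pattern, data_dir=DEFAULT_DATA_DIR):
    """List files matching `pattern` beneath <repo_root>/<data_dir>, sorted by path."""
    base = Path(repo_root) / data_dir
    if not base.is_dir():
        raise FileNotFoundError(f"Expected a data directory at {base}")
    # rglob searches recursively, so subdirectory names do not need to be hardcoded.
    return sorted(base.rglob(pattern))
```

For example, `list_input_files(repo_root, "*.rds")` would return whichever matching files data/current currently points to, so the same code can run on test data and real data without changes.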

It's also helpful for your module to have a single entry point that runs all module scripts and/or notebooks in their intended order, e.g. a shell script. This way, the module testing GHA can directly call this script to execute the entire module.
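The text above suggests a shell script as the entry point; the same idea can be sketched in Python. In this rough example, the `run_steps` helper and the two placeholder steps are hypothetical, and you would replace the placeholder commands with calls to your module's own scripts and notebooks.

```python
import subprocess
import sys


def run_steps(steps):
    """Run each (name, command) pair in order, stopping at the first failure."""
    for name, command in steps:
        print(f"Running step: {name}", flush=True)
        # check=True raises CalledProcessError on a non-zero exit code,
        # so a failing step stops the run instead of being silently skipped.
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical placeholder steps; replace with the module's own commands.
    steps = [
        ("prepare", [sys.executable, "-c", "print('prepare inputs')"]),
        ("analyze", [sys.executable, "-c", "print('run analysis')"]),
    ]
    run_steps(steps)
```

Because an uncaught error makes the script exit with a non-zero status, a GHA step that calls such an entry point would be marked as failed whenever any module step fails.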

Writing a module testing GHA

Tip

The Data Lab will generally maintain and write module testing GHAs, but you are welcome to do so as well if you are interested! See this GitHub documentation to learn about workflow syntax for GHAs.

When you create a new module, a module testing GHA workflow file is created at .github/workflows/run_{module-name}.yml. This initial file is inactive, meaning it will not run automatically on the two aforementioned triggers. As an analysis module matures, the Data Lab staff will activate this GHA file so the module can be regularly tested.

GHA steps

Each module testing GHA is initially created with these steps, which should be updated to reflect the given module's needs:

- Check out the repository
- Set up the module environment
    - Depending on the flags used when creating your module, these steps will install the renv and/or conda environment from existing environment files (renv.lock and/or conda-lock.yml, respectively).
- Download test data
    - Use the download-data.py and/or download-results.py scripts to download the input files you need, passing the --test-data flag so that the test data is downloaded.
    - After this step, the data/current directory will point to the test data, ensuring the module GHA runs using the test data.
- Run the analysis module
    - Generally, this will involve calling the module's run script.

As an analysis module matures, the GHA will be updated to run the analysis in the module's Docker image, rather than using the renv and/or conda environment files. Module testing GHAs can use their module's Docker image once it has been built and pushed to the registry.