Creating an analysis module
Once you have discussed your proposed analysis and filed an issue to get started, you're ready to create an analysis module.
We have provided a script, create-analysis-module.py, that you can use to make this process easier.
Before you begin, please review this documentation for an overview of the skeleton file structure of an empty analysis module.
Is this your first analysis module?
If you are creating your first analysis module, we recommend that you scope your first pull request to contain only:
- The analysis module skeleton created by running create-analysis-module.py
- Some initial documentation of your module in the README.md file
This way, you can get some experience filing pull requests, undergoing code review, and using other Git skills before you really get going on the analysis.
Running the module creation script
You should use the create-analysis-module.py script to create a skeleton module for your analysis in the analyses folder.
Before running this script, you should determine two things:
- The name of your module
  - You will provide this name as an argument to the script
- Whether you will use R or Python (or both!) in your module
  - You can use certain script flags to add a template notebook and/or set up an environment (renv or conda) for your chosen language, as described below in detail
  - If you do not use additional flags, the script will create a skeleton module only
To run this script, take these steps:
- Open a terminal window.
  - You may wish to launch a terminal from GitKraken so that you are automatically placed in the repository.
- Make sure you are working in your base conda environment by running conda activate base.
- As needed, navigate using cd to your OpenScPCA-analysis fork.
- Run the module creation script:
  # Create a module called `my-module-name`
  ./create-analysis-module.py my-module-name <additional flags here>
  - The script's first argument is required: the name of your module.
    - The example code above will create an analysis module called my-module-name
  - You can also use an additional flag to add a template notebook or set up an environment with renv or conda
Tip
You can always run the script with the --help flag to reveal the help menu and see other flags you can use:
./create-analysis-module.py --help
Module workflows
The create-analysis-module.py script will also create two additional files besides your analysis module.
These files, stored in the repository folder .github/workflows, are GitHub Actions workflow files that the OpenScPCA project uses to ensure module reproducibility.
The workflows are disabled by default.
- run_{my-module-name}.yml contains a skeleton workflow for testing the analysis module. Learn more about module testing workflows here.
- docker_{my-module-name}.yml contains a skeleton workflow for building the analysis module's Dockerfile
Please commit these files as part of your first pull request, and we'll take care of the rest!
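To make the naming pattern concrete, here is a minimal Python sketch that derives the two workflow file paths from a module name. It assumes that `{my-module-name}` in the file names above is a placeholder replaced by your module's name; the helper `workflow_files` is hypothetical and is not part of create-analysis-module.py.

```python
from pathlib import PurePosixPath

# Repository folder where the workflow files are stored
WORKFLOW_DIR = PurePosixPath(".github/workflows")


def workflow_files(module_name):
    """Return the two expected workflow paths for a module (hypothetical helper).

    Assumes `{my-module-name}` in the documented file names is a placeholder
    for the module name.
    """
    return [
        WORKFLOW_DIR / f"run_{module_name}.yml",
        WORKFLOW_DIR / f"docker_{module_name}.yml",
    ]


# List the workflow files to look for when preparing your first pull request
for path in workflow_files("my-module-name"):
    print(path)
```

A list like this can help you confirm that both workflow files are staged before you open your pull request.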
Module creation script flags
We recommend using one of these flags when creating your module. Each flag will create a skeleton module with the given files and folders.
Note that you can use both an R and a Python flag if you want to write your module in both languages.
Each example below shows the resulting module file structure when using each flag.
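As an illustration of combining flags, the following Python sketch assembles a script invocation for a module that uses both R and Python. It only builds and prints the command; it does not run the script. The helper `build_command` and the `DOCUMENTED_FLAGS` tuple are hypothetical names used for this example, and the tuple covers only the flags named on this page (the script's --help menu may list others).

```python
import shlex

# Flags named in this documentation; the script may accept others (see --help)
DOCUMENTED_FLAGS = (
    "--use-r",
    "--use-renv",
    "--use-jupyter",
    "--use-python",
    "--use-conda",
)


def build_command(module_name, *flags):
    """Return the argument list for a module creation call (hypothetical helper)."""
    unknown = [flag for flag in flags if flag not in DOCUMENTED_FLAGS]
    if unknown:
        raise ValueError(f"Flags not covered in this documentation: {unknown}")
    # The module name must be the first argument, followed by any flags
    return ["./create-analysis-module.py", module_name, *flags]


# Combine an R flag and a Python flag for a module written in both languages
command = build_command("my-module-name", "--use-renv", "--use-jupyter")
print(shlex.join(command))
# prints: ./create-analysis-module.py my-module-name --use-renv --use-jupyter
```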
Flags to create an R module
The --use-r flag
Use this flag to add a template R notebook to your module:
# Create a module called `my-module-name` with a template R Markdown notebook
./create-analysis-module.py my-module-name --use-r
- You can use notebook-template.Rmd as a starting point for any R Markdown notebooks you create while writing your analysis
The --use-renv flag
Info
The hello-r example module was created with this flag.
Learn more about using renv to manage your R environment.
Use this flag to:
- Add a template R notebook to your module
- Initialize an renv environment for your module
# Create a module called `my-module-name` with a template R Markdown notebook
# and an `renv` environment
./create-analysis-module.py my-module-name --use-renv
- You can use notebook-template.Rmd as a starting point for any R Markdown notebooks you create while writing your analysis
- You can use components/dependencies.R to pin R package dependencies that renv does not automatically capture
- These additional files and folders manage the renv environment, and you should not directly edit them:
  - renv.lock
  - The renv folder
  - .Rprofile
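If you want to confirm that the skeleton was created as described, a small check like the Python sketch below may help. It assumes that the files and folders listed above sit directly inside your module folder (for example, analyses/my-module-name), which this page does not state explicitly; the helper `missing_renv_paths` is hypothetical.

```python
from pathlib import Path

# Paths named in this section, ASSUMED to be relative to the module folder
RENV_MODULE_PATHS = (
    "notebook-template.Rmd",
    "components/dependencies.R",
    "renv.lock",
    "renv",
    ".Rprofile",
)


def missing_renv_paths(module_dir):
    """Return the expected paths that do not exist under module_dir."""
    module_dir = Path(module_dir)
    return [path for path in RENV_MODULE_PATHS if not (module_dir / path).exists()]


# Check a module created in the analyses folder; an empty list means all were found
print(missing_renv_paths("analyses/my-module-name"))
```

The check only reads the file system, so it is consistent with the advice not to edit the renv-managed files directly.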
Flags to create a Python module
The --use-jupyter flag
Info
The hello-python example module was created with this flag.
Learn more about using conda to manage your Python environment.
Use this flag to:
- Add a template Jupyter notebook to your module
  - To add a template Python script instead, use --use-python
- Initialize a conda environment for your module
  - The conda environment will be named openscpca-<module name>
    - For example, if you name your module celltype-ewings, its conda environment will be named openscpca-celltype-ewings
  - The conda environment will include an installation of Jupyter that you can launch with the jupyter lab command when the environment is active
# Create a module called `my-module-name` with a template Jupyter notebook
# and a conda environment with Jupyter installed
./create-analysis-module.py my-module-name --use-jupyter
# Or, create a module called `my-module-name` with a template Python script
# and a conda environment (Jupyter not installed)
./create-analysis-module.py my-module-name --use-python
- You can use notebook-template.ipynb (or script-template.py) as a starting point for any Jupyter notebooks (or Python scripts) you create while writing your analysis
- You can use the environment.yml file to add additional packages to your module's conda environment
The --use-conda flag
Use this flag to initialize a conda environment in your module, but without a template script or notebook.
The conda environment will be named openscpca-<module name>.
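The naming rule for conda environments can be sketched in Python as follows. The helper `conda_env_name` is a hypothetical name used for illustration; the example only prints the activation command you would then type in your terminal.

```python
def conda_env_name(module_name):
    """Return the conda environment name for a module: openscpca-<module name>."""
    return f"openscpca-{module_name}"


# Module name taken from the example earlier on this page
env_name = conda_env_name("celltype-ewings")
print(f"conda activate {env_name}")
# prints: conda activate openscpca-celltype-ewings
```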