diff --git a/.github/README.md b/.github/README.md
new file mode 100644
index 0000000..ba39d7b
--- /dev/null
+++ b/.github/README.md
@@ -0,0 +1,2 @@
+# GitHub config files
+This directory is used to define templates for pull requests and issues, alongside defining the CI/CD in the "workflows" directory.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
index a2c4571..0035bca 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -1,29 +1,41 @@
 # How to contribute
-First off, thank you for taking the time to contribute! If you have a function that you would like to see in codon, we have a few standards and guidelines that we would like you to follow before we consider your merge request. Failure to follow the contribution guide will result in your merge request being challenged or rejected.
+First off, thank you for taking the time to contribute! If you have a function that you would like to see in the National Reusable Code Library, we have a few standards and guidelines that we would like you to follow before we consider your merge request. Failure to follow the contribution guide will result in your merge request being challenged or rejected.
-We are looking for functions and/or classes which are useful for workflows in DIS specifically. Please do not submit the following...
-* End-to-end scripts cannot be implemented into the package. If you find something reusable within your end-to-end script, then please feel free to extract it and submit it to Codon with tests attached.
-* Multiple functions that are unrelated. For example, "This function takes an integer number and rounds it to the nearest whole number". Please do not include multiple functions unless they're methods of a class or are related to the same file (i.e. two methods of suppression)
+We are looking for reusable functions and/or classes which are useful across multiple workflows in NHS data processing and analytics. Please do not submit the following...
+* End-to-end scripts that cannot be implemented into the package. If you find something reusable within your end-to-end script, then please feel free to extract it and submit it with its unit tests attached.
 * Duplicated functionality. For example, if your function is already done by another well known package.
 * Irrelevant functionality. If the function you submit is unrelated to DIS, it will mostly likely be challenged or rejected.
-## Basic idea
+When making a submission, it helps to keep it small and to make sure the content is all related: it follows a single theme, or consists of different methods of one class. This makes review and acceptance more straightforward.
-1. [Fork](https://help.github.com/en/articles/fork-a-repo) codonPython on GitHub.
+## Eligibility
+Anyone is free to suggest contributions to this repository, access it, and utilise its contents. However, as a number of steps in the approval process take place within NHS England, if you're external it might be wise to email the repository owners to discuss your submission so that they can bring it through the governance and approval process within NHS England. Otherwise you might end up waiting until someone notices a new pull request and considers it.
-2. Write your documented function and tests (:heart_eyes:) on a new branch, coding in line with our **coding conventions**.
+The process for submitting new code is:
-3. Submit a [pull request](https://help.github.com/en/articles/creating-a-pull-request) **to the dev branch** of codonPython with a clear description of what you have done.
+1. Identify the reusable components within your code and refactor them to be standalone functions / classes (i.e. not integrated with or dependent on the work they were originally a part of).
+2. [Fork](https://help.github.com/en/articles/fork-a-repo) this repo on GitHub.
+4. Write your documented function and tests (:heart_eyes:) on a new branch, coding in line with our **coding standards**.
+5. Submit a [pull request](https://help.github.com/en/articles/creating-a-pull-request) to the **main branch** of the National Reusable Code Library with a clear description of what you have done, fully completing all sections of the `pull_request_template.md` (the empty template should appear automatically in the pull request).
+6. The maintainers of the repository will review the code, and, if it meets the basic standards set out here, take it for consideration to the Reusable Code Advisory Group within NHS England.
+7. If rejected, feedback will be provided, and once this has been actioned it can be taken back to the Advisory Group.
+8. If accepted, the pull request will be merged into the main branch.
+9. At regular intervals, the code will be packaged up and released - the commit will be tagged with a semantic version, following the usual approach of patch (trivial), minor and major (breaking) changes. These versioned packages will be published to PyPI and potentially other package repositories.
 We suggest you make sure all of your commits are atomic (one feature per commit). Please make sure that non-obvious lines of code are commented, and variable names are as clear as possible. Please do not send us undocumented code as we will not accept it. Including tests to your pull request will bring tears of joy to our eyes, and will also probably result in a faster merge.
-## Coding conventions
+## Coding standards
-We use the industry standard [PEP 8](https://www.python.org/dev/peps/pep-0008/) styling guide within the `codonPython` package. **Therefore, it’s imperative that you use the coding standards found within PEP 8 when creating or modifying any code within the `codonPython` package**. Autoformatters for PEP8, for instance [black](https://black.readthedocs.io/en/stable/), can easily ensure compliance. The reason we use PEP 8 coding standards is to make sure there is a layer of consistency across our codebase. This reduces the number of decisions that you need to make when styling your code, and also makes code easier to read when switching between functions etc.
+In short:
+- Follow PEP 8, and use Black to enforce this on your code
+- Have unit tests for all your functions / methods
+- Write [NumPy-style](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard) docstrings for functions, including descriptions, inputs, outputs, and example usage
+- Methods used in the code should have been approved by the appropriate body, e.g. the Data Quality Steering Group for DQ rules. For external submissions, this might require a longer review process in which these are brought to those bodies.
-While you are creating code, we recommend that you understand the style guide standards for the following topics:
+We use the industry standard [PEP 8](https://www.python.org/dev/peps/pep-0008/) styling guide. **Therefore, it’s imperative that you use the coding standards found within PEP 8 when creating or modifying any code within the repository**. Autoformatters for PEP8, for instance [black](https://black.readthedocs.io/en/stable/), can easily ensure compliance. The reason we use PEP 8 coding standards is to make sure there is a layer of consistency across our codebase. This reduces the number of decisions that you need to make when styling your code, and also makes code easier to read when switching between functions etc.
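As a rough illustration of the coding standards described in this patch (PEP 8 layout, a NumPy-style docstring with example usage, and unit tests alongside the function), a contribution might look something like the sketch below. The function name `is_valid_nhs_number` and the test functions are hypothetical and are not taken from the library; the logic follows the publicly documented Modulus 11 check for NHS numbers, and the tests are written as plain `assert` functions on the assumption that a runner such as pytest collects them.

```python
def is_valid_nhs_number(nhs_number: str) -> bool:
    """Check a 10-digit NHS number against its Modulus 11 check digit.

    Hypothetical example for illustration only; not the library's own code.

    Parameters
    ----------
    nhs_number : str
        The NHS number as a string of exactly 10 digits, without spaces.

    Returns
    -------
    bool
        True if the final digit matches the computed check digit.

    Examples
    --------
    >>> is_valid_nhs_number("9434765919")
    True
    """
    # Reject anything that is not exactly ten ASCII digits
    if len(nhs_number) != 10 or not (nhs_number.isascii() and nhs_number.isdigit()):
        return False

    digits = [int(char) for char in nhs_number]
    # Weight the first nine digits by 10, 9, ..., 2 and sum the products
    weighted_sum = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check_digit = 11 - (weighted_sum % 11)
    if check_digit == 11:
        check_digit = 0
    # A computed check digit of 10 can never match a single digit, so it fails
    return check_digit == digits[9]


def test_valid_number_passes():
    assert is_valid_nhs_number("9434765919")


def test_wrong_check_digit_fails():
    assert not is_valid_nhs_number("9434765918")


def test_malformed_input_fails():
    for bad_value in ("", "12345", "94347659AB", "94347659190"):
        assert not is_valid_nhs_number(bad_value)
```

Keeping the function and its tests this small also matches the request that submissions stay focused on a single theme.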
+The following articles can be useful when writing code to the standard:
 * [Code layout](https://www.python.org/dev/peps/pep-0008/#code-lay-out) – Indentation, tabs or spaces, maximum line length, blank lines, source file encoding, imports & module level Dunder name
 * [String quotes](https://www.python.org/dev/peps/pep-0008/#string-quotes)
 * [Whitespace in expressions and statements](https://www.python.org/dev/peps/pep-0008/#whitespace-in-expressions-and-statements) – Pet Peeves, alternative recommendations
@@ -32,21 +44,11 @@ While you are creating code, we recommend that you understand the style guide st
 * [Naming conventions](https://www.python.org/dev/peps/pep-0008/#naming-conventions) – Naming styles, naming conventions, names to avoid, ASCII compatibility, package and module names, class names, type variable names, exception names, global variable names, function and variable names, function and method arguments, method names and instance variables, constants & designing for inheritance
 * [Programming recommendations](https://www.python.org/dev/peps/pep-0008/#programming-recommendations) – Function annotations & variable annotations
-We also use docstrings and we try to follow [`numpy`'s docstring standards](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard).
-Start reading our code to get a feel for it but most importantly, remember that this is open source software - consider the people who will read your code, and make it look nice for them.
+## Community
-* We use [PEP8](https://www.python.org/dev/peps/pep-0008/). Autoformatters for PEP8, for instance [black](https://black.readthedocs.io/en/stable/), can easily ensure compliance.
-* We use docstrings and we try to (loosely) follow [`numpy`'s docstring standards](https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard).
-* This is open source software. Consider the people who will read your code, and make it look nice for them.
-
-## Tests
-
-We do ask that you include some basic tests with your contributions. While the logic of your contribution is important, some basic unit tests to verify functionality and data types for the inputs are requested for a baseline level of assurance and 'elegant failing'.
+The discussions space on this GitHub repo is the perfect place to discuss the code and reach out to those working on it.
 ## Code of Conduct
-As a contributer you can help us keep the Codon community open and inclusive. Please read and follow our [Code of Conduct](https://github.com/codonlibrary/code-of-conduct/tree/master). By contributing to it, you agree to comply with it.
-
-:clinking_glasses: Thank you!
-Team codon
+As a contributor you can help us keep this community open and inclusive. Please read and follow our [Code of Conduct](https://github.com/codonlibrary/code-of-conduct/tree/master). By contributing to it, you agree to comply with it.
diff --git a/LICENSE b/LICENSE.md
similarity index 91%
rename from LICENSE
rename to LICENSE.md
index 4fb606d..e87e20c 100644
--- a/LICENSE
+++ b/LICENSE.md
@@ -1,4 +1,6 @@
-Copyright 2019 NHS Digital DIS Team
+Copyright 2025 NHS England Data Architecture Team
+
+# BSD 3-Clause Licence
 Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
@@ -8,4 +10,4 @@ Redistribution and use in source and binary forms, with or without modification,
 3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
-THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
\ No newline at end of file
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/README.md b/README.md
index f5fa278..ccd7cc6 100644
--- a/README.md
+++ b/README.md
@@ -24,28 +24,62 @@ By making code reusable, and making it easy to reuse, this work aims to:
 **Be more cost effective**: Reusable 'generalised' code will increase efficiency in creating higher level processes.
-
 ## Installation
-The package can be directly installed by typing in your terminal:
-```r
-# TBC
+The package can be installed directly as a Python package from PyPI by typing in your terminal:
+```terminal
+pip install nhs_reusable_code_library
 ```
 Other platform specific instructions to follow.
+## How to use the National Reusable Code Library package
+When using Python, given that you've installed the package as described above, you can simply [import](https://docs.python.org/3/tutorial/modules.html#packages) it as normal:
+
+```Python
+from nhs_reusable_code_library.standard_data_validations.nhsNumberValidation import mod11_check
+
+nhs_number = '1111111111'
+
+nhs_number_valid = mod11_check(nhs_number)
+```
+
+## Code Manifest
+(Made using "[project-tree-generator](https://project-tree-generator.netlify.app/generate-tree)")
+
+```
+Reusable-Code-Library/
+├── .github/  # Directory for GitHub specific templates and CI/CD (GitHub Actions)
+│   ├── ISSUE_TEMPLATE/  # templates for when people raise issues
+│   ├── pull_request_template.md  # template used when a pull request is raised
+│   └── workflows/  # GitHub Actions (CI/CD) pipelines go here
+│       └── ci.yml  # This is the Continuous Integration pipeline which runs the unit tests and tests the package builds
+├── src/
+│   └── nhs_reusable_code_library/  # the main package directory which will have a number of libraries
+│       ├── standard_data_validations/  # the place for data quality rules code
+│       │   ├── nhsNumberValidation/  # NHS number validation related code
+│       │   ├── polars/  # Polars implementations of data quality rules code
+│       │   └── pyspark/  # PySpark implementations of data quality rules code
+│       └── tests/  # the unit tests for the functions within the package
+├── .gitignore  # tells the repo which files to ignore, e.g. temporary, hidden and background files, and outputs.
+├── CONTRIBUTING.md  # Describes how to contribute to the repository
+├── LICENSE.md  # Describes the License the code can be used under.
+├── pyproject.toml  # Used when building the package
+└── README.md  # Describes what the package is for and how to use it.
+```
+
+## Governance
+New reusable code is discussed and signed off in the Reusable Code Assurance Group within NHS England. This group also sets the standards that the code must meet.
+New code must have appropriate unit tests and all unit tests must pass before it can be merged into the main branch. These tests can be found in the `src/.../tests` folders.
+
 ## Contributing
-All new contributions to the `National Reusable Code Library` are welcome; please follow the Coding Conventions in the guidance document for contribution guidance.
+All new contributions to the `National Reusable Code Library` are welcome; please follow the guidance document for contributions.
 Any improvements to documentation, bug fixes or general code enhancements are also welcomed. If a bug is found on the master branch, please use the GitHub guidance on raising an [issue.](https://help.github.com/en/github/managing-your-work-on-github/creating-an-issue)
 ## New to GitHub?
-GitHub is a hosting site that allows for development and version control of software using Git. It allows users to edit and develop parts of code independently before submitting back to the master code, whilst using version control to track changes. Introductory videos to GitHub for beginners can be found [here.](https://github.com/codonlibrary/codonPython/wiki/2a.-GitHub-for-Beginners)
-
-Quick links to beginner guidance can also be found below:
-
-* [**Cloning a repository to your local machine using GitBash**](https://github.com/codonlibrary/codonPython/wiki/1.-Installing-codonPython)
-* [**Checking out a branch using GitBash**](https://github.com/codonlibrary/codonPython/wiki/2b.-Checkout-a-branch-using-GitBash)
-* [**Removing a Commit from a repository using GitBash**](https://github.com/codonlibrary/codonPython/wiki/3.-Removing-a-Commit-From-a-GitHub-Repository)
+GitHub is a hosting site that allows for development and version control of software using Git. It allows users to edit and develop parts of code independently before submitting back to the master code, whilst using version control to track changes. Introductory guidance can be found [here](https://nhsdigital.github.io/rap-community-of-practice/training_resources/git/introduction-to-git/).
-All other `codon` "How-to Articles" can be found [here.](https://github.com/codonlibrary/codonPython/wiki/2.-Git-Guidance)
+## Acknowledgments
+Thanks in particular to the amazing work of both the [NHS Digital RAP Squad](https://nhsdigital.github.io/rap-community-of-practice), and the [NHS Codon Project](https://github.com/codonlibrary/codonPython) who greatly inspired this work and set the foundations for it years ago.
-Suggestions regarding additional guidance or How-to articles are welcome.
+## Contact
+NHS England Data Architecture Team
diff --git a/docs/README.md b/docs/README.md
index 6f6536f..6aadf9a 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -1,3 +1,4 @@
 # Documentation files
-The files that Sphinx builds the documentation from are located here. ReStructuredText `.rst` files are text files similar to markdown which allow formatting and interactivity with Sphinx. As with the functions in codon, improvements to the documentation files here are welcolmed.
+## ToDo
+This currently contains the old files from the Codon project - it needs updating to automatically generate code documentation, e.g. via Sphinx.