Enhancing the Development Environment

The current state of the development environment of the github2fedmsg project leaves a lot to be desired. This document attempts to address the existing technical debt in the github2fedmsg project repository.

Investigations

Continuous Integration

Continuous integration was previously configured for the repository and was last active on Jun 17, 2020, on PR #27. It ran on a Fedora Infrastructure Jenkins instance in the CentOS CI namespace (as deduced from the details URL), but that instance is either no longer available or can only be accessed via the Red Hat VPN. In the former case, we would want to reconfigure continuous integration in the project repository. In the latter case, we would want to make it accessible to the wider community to promote contributor participation in the project.
To set up continuous integration in the project repository, GitHub Actions can be used: the repository is itself hosted on GitHub, which allows for closer integration and support. It is also possible to use Fedora Linux containers in GitHub Actions, as evidenced by their use in Meetbot Logs, CentOS Duffy, Fedora Messaging Notifier, MDAPI and many more such projects, which would help replicate the production environment accurately for the relevant build and test jobs. Other continuous integration options such as Zuul CI or Travis CI can also be considered, depending on particular needs.

Development Environment

A small amount of information about setting up a virtual environment and installing the project using setuptools is currently available in the README.rst document in the project repository. Although the current setup works fine in a specific environment with the (now EOL) Python 2.7.x, it can be useful to abstract away the use of setuptools with dependency management tools such as poetry or pipenv. Doing so avoids spending unnecessary attention on creating and updating the setuptools configuration, facilitates automatic dependency updates, and makes it possible to maintain separate lists of dependencies for development, testing and production scenarios.
Maintaining the codebase to established quality standards will be of great help, both for the convenience of existing contributors and for onboarding potential ones. In the current setup, flake8 is used for linting the codebase. In addition to flake8, isort and black can be used for sorting imports and enforcing code style, respectively. These tools have been of great help in maintaining code quality, as evidenced by their use in Meetbot Logs, CentOS Duffy, Fedora Messaging Notifier, MDAPI and many more such projects.

Automatic Dependency Updates

Projects such as Meetbot Logs, CentOS Duffy and MDAPI use GitHub Dependabot to automatically update the dependency lockfile as and when new updates and vulnerability fixes come out for the dependencies that are not excluded, within the version requirements stated in the project configuration file. Version updates require a valid GitHub Dependabot configuration in the repository, whereas vulnerability fixes do not and work regardless of whether one is present. Being GitHub’s own security worker bot, it is more tightly integrated with project repositories hosted on GitHub.
Projects such as Fedora Messaging Notifier and Anitya use Renovate to automatically update the dependency lockfile as and when new updates or vulnerability fixes come out for the dependencies that are not excluded, within the version requirements stated in the project configuration file. Renovate is highly flexible in its configuration: it allows checking the dependency lockfile to verify checksums, organizing custom schedules for updates, and auto-merging dependency update pull requests.

Outdated Dependencies

The project makes use of a set of dependencies that are either left in a pre-release state or are no longer actively maintained. While these dependencies can still be used in the project in its current state, it is likely that security vulnerabilities have gone unpatched in the absence of feature updates, security patches and bug fixes. Moreover, because these are older libraries and frameworks, up-to-date documentation is likely to be scarce, which makes development with these dependencies more challenging for the maintainers.
Moving forward, we would want to look into either replacing the following dependencies or avoiding their use entirely.
1. Velruse had its last update almost 9 years ago, on Aug 30, 2013.
2. WebError had its last update almost 6 years ago, on Apr 10, 2016, and the project documentation states that the project is no longer actively maintained.
3. pyramid-mako had its last update almost 3 years ago, on Aug 19, 2019.
4. transaction had its last update almost 2 years ago, on Dec 11, 2020.
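
A staleness check like the one behind this list can be sketched in a few lines of Python. The last-update dates are taken from the list above. The reference date, the age threshold and the helper name stale_dependencies are assumptions made purely for illustration and are not part of the project.

```python
from datetime import date

# Last-update dates as given in the list above.
LAST_UPDATES = {
    "Velruse": date(2013, 8, 30),
    "WebError": date(2016, 4, 10),
    "pyramid-mako": date(2019, 8, 19),
    "transaction": date(2020, 12, 11),
}


def stale_dependencies(last_updates, today, max_age_days):
    """Return (name, age_in_days) pairs older than max_age_days, oldest first."""
    stale = [
        (name, (today - updated).days)
        for name, updated in last_updates.items()
        if (today - updated).days > max_age_days
    ]
    return sorted(stale, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Assumption: the reference date and the one-year threshold are illustrative only.
    reference = date(2022, 9, 1)
    for name, age in stale_dependencies(LAST_UPDATES, reference, max_age_days=365):
        print(f"{name}: last updated about {age / 365.25:.1f} years ago")
```

In practice the dates would be read from the package index or the upstream repositories rather than hard-coded, and the threshold would be a matter of project policy.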

Contributor’s Quality of Life

Additional documents such as a CODE OF CONDUCT and CONTRIBUTING GUIDELINES can either be added or linked to, in order to set the right expectations for both existing and potential contributors. The wiki feature of the project repository is a good place for documents like these, which are unlikely to change across releases. Such documents are available in repositories such as MDAPI [1] [2], Meetbot Logs [1] [2], CentOS Duffy [1] etc. and can be included here too.
Tests and code coverage should be added to ensure that the project code remains functional over time as features are added, bugs are fixed and security vulnerabilities are patched. The tests currently available in the project repository have a limited scope and are not run automatically by continuous integration on pull requests, on pushes or before deployment workflows. Tests and code coverage have been implemented in Meetbot Logs, CentOS Duffy, Fedora Messaging Notifier, MDAPI and many more such repositories, where they are automated using GitHub Actions.

Automating Deployments to OpenShift

The project currently makes use of semantic versioning for its releases and requires manual intervention for deploying every distinct release to both staging and production environments. The effort can be reduced by using webhooks, which would monitor the project repository for new releases and then automatically run deployment scripts to put the new version up on the staging environment. For this, GitHub webhooks can be configured for the OpenShift cluster in question. Production deployments should continue to require manual intervention, to leave room for additional due diligence before deployment.
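
As a rough illustration of the receiving side of such a webhook, the sketch below validates a GitHub delivery and decides whether it should trigger a staging deployment. It relies on two documented GitHub behaviours: the X-Hub-Signature-256 header carries an HMAC-SHA256 of the raw request body, and the event name arrives in the X-GitHub-Event header. The function names and the policy of deploying only published, non-draft, non-prerelease releases are assumptions made for this example, not the project's actual setup.

```python
import hashlib
import hmac
import json


def signature_is_valid(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check the X-Hub-Signature-256 header value against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(expected.encode(), signature_header.encode())


def staging_tag_for(event: str, body: bytes):
    """Return the release tag to deploy to staging, or None to ignore the delivery.

    Assumption: only published, non-draft, non-prerelease releases are deployed.
    """
    if event != "release":
        return None
    payload = json.loads(body)
    if payload.get("action") != "published":
        return None
    release = payload.get("release", {})
    if release.get("draft") or release.get("prerelease"):
        return None
    return release.get("tag_name")
```

A handler would call signature_is_valid first and reject the request on failure, then pass the returned tag to whichever deployment script targets the staging environment. Production stays outside this path, in line with the manual step described above.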