PDC replacement research
PDC is a repository and API for storing and querying metadata related to packages and product releases. We rely on the information in it to produce Fedora releases, manage the retirement of packages, and more.
Our current deployment runs at https://pdc.fedoraproject.org/.
It uses https://github.com/product-definition-center/product-definition-center/tree/python-pdc-1.9.0-2
The software has been orphaned by its original maintainers, and the version we run requires an EOL version of Python and an EOL version of the Django framework. This means we need to either upgrade it or replace it.
Abstract take-away
There is no silver bullet; replacing PDC will require development effort
Responsibility for retirement and SLAs of packages could be moved to pagure-dist_git
Responsibility for critical-path and releases can be moved to Bodhi
Nightly compose metadata requires a new application, which would be much simpler than the current PDC
Previous work
There has been a proposal by Clement Verna with many of the use cases already mapped out, and a proof of concept in the fedora-infra/fpdc repo, but it was set aside in favor of more pressing work.
Current users of PDC
Based on conversations with colleagues from Releng and Fedora QA, and a cursory investigation of our repositories, we use PDC, either through its API directly or through the CLI client, in:
releng scripts
fedpkg
fedfind
bodhi
pagure
modulebuildservice
mirrormanager scripts in ansible-repo
new hotness
fedora messaging
osbs client
As a part of this investigation we want to create an exhaustive list, with an analysis of the actual PDC use case in each application.
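When auditing these clients, it helps to know that a typical lookup is a plain HTTP GET with query-string filters. The following is a minimal sketch of building such a request for a package branch. The endpoint path `rest_api/v1/component-branches/` and the filter names `global_component` and `name` are assumptions here and should be checked against the deployed API; the helper `branch_query_url` is hypothetical.

```python
from urllib.parse import urlencode, urljoin

# Deployment named earlier in this document.
PDC_URL = "https://pdc.fedoraproject.org/"


def branch_query_url(package, branch, base_url=PDC_URL):
    """Build the URL for looking up one branch of one package.

    The endpoint path and the filter names are assumptions,
    not verified against the running service.
    """
    endpoint = urljoin(base_url, "rest_api/v1/component-branches/")
    params = {"global_component": package, "name": branch}
    return endpoint + "?" + urlencode(params)
```

Searching a client's source for URLs of this shape is one quick way to find which PDC resources it actually depends on.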
Solutions to be explored
We have several options:
Upgrade PDC to supported version of Django and take over the maintenance
Use a database with a simple API gateway like Postgrest or Prest in front of it
Create a new bespoke application that better suits our needs
Incorporate the functionality we need into other applications we already have (Pagure/Bodhi/Datagrepper)
Currently we are proposing primarily the last option.
Preliminary notes on maintaining the current PDC
This would require significant investment: several upgrades of the underlying software and probably a Python 2.x to 3.x migration as well.
Moreover, based on the discussions within CPE:
current PDC is more complex than we need
we want to avoid maintaining another application
Because of this, we won’t focus on this avenue of exploration.
Preliminary notes on using Postgrest
Based on our discussions, we mostly use a simple CRUD API, which means that a new database model that better suits our needs could be all we need to migrate.
The biggest advantage of this approach is using an off-the-shelf service, where the only things we need to maintain are the database model and the clients, plus keeping the software up to date.
We have chosen two candidates to investigate, based on the fact that we probably want to use PostgreSQL on the backend:
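To make "simple CRUD" concrete, here is a minimal sketch of the four operations against a single table. Everything in it is hypothetical: the `package_branch` table, its columns, and the helper names are invented for illustration, and the in-memory SQLite database is only a stand-in so the example runs without a server. An API gateway would expose the same four operations over HTTP.

```python
import sqlite3

# Hypothetical schema: one row per package branch with its end-of-life date.
SCHEMA = """
CREATE TABLE package_branch (
    package TEXT NOT NULL,
    branch  TEXT NOT NULL,
    eol     TEXT,                        -- ISO date, NULL while no EOL is set
    active  INTEGER NOT NULL DEFAULT 1,  -- 0 once the branch is retired
    PRIMARY KEY (package, branch)
)
"""


def open_db():
    """Create an in-memory database with the hypothetical schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def create_branch(conn, package, branch, eol=None):
    conn.execute(
        "INSERT INTO package_branch (package, branch, eol) VALUES (?, ?, ?)",
        (package, branch, eol),
    )


def read_branch(conn, package, branch):
    """Return (package, branch, eol, active), or None if there is no such row."""
    return conn.execute(
        "SELECT package, branch, eol, active FROM package_branch "
        "WHERE package = ? AND branch = ?",
        (package, branch),
    ).fetchone()


def retire_branch(conn, package, branch):
    conn.execute(
        "UPDATE package_branch SET active = 0 WHERE package = ? AND branch = ?",
        (package, branch),
    )


def delete_branch(conn, package, branch):
    conn.execute(
        "DELETE FROM package_branch WHERE package = ? AND branch = ?",
        (package, branch),
    )
```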
Postgrest as a simple API on top of a database model sounded promising, but after investigating how to integrate it with our current account system, it turned out we would need to maintain a separate service that would serve as a bridge:
https://postgrest.org/en/stable/auth.html
https://samkhawase.com/blog/postgrest/postgrest_auth0_service/
This speaks against using Postgrest.
Create a new bespoke application that better suits our needs
We investigated using FastAPI and FAS, as well as the fpdc proof of concept previously started by Clement. Based on discussions with others, we decided to investigate a way to merge the existing functionality into the services we already maintain.