PDC replacement research

PDC is a repository and API for storing and querying metadata related to packages and product releases. We rely on the information in it to produce Fedora releases, manage the retirement of packages, and more.

Our current deployment runs at https://pdc.fedoraproject.org/.

It uses https://github.com/product-definition-center/product-definition-center/tree/python-pdc-1.9.0-2

The software has been orphaned by its original maintainers, and the deployed version requires an EOL version of Python and an EOL version of the Django framework. This means we need to either upgrade it or replace it.

Abstract take-away

  • There is no silver bullet; any option will require development effort

  • Responsibility for retirement and SLAs of packages could be moved to pagure-dist_git

  • Responsibility for critical-path and releases can be moved to Bodhi

  • Nightly compose metadata requires a new application, which would be much simpler than the current PDC

Previous work

There has been a proposal by Clement Verna with many of the use cases already mapped out, and a POC in the fedora-infra/fpdc repo, but it was set aside for more pressing work.

Current users of PDC

Based on conversations with colleagues from Releng and Fedora QA, and a cursory investigation of our repositories, we use PDC, either through its API or through the CLI client, in:

  • releng scripts

  • fedpkg

  • fedfind

  • bodhi

  • pagure

  • modulebuildservice

  • mirrormanager scripts in ansible-repo

  • new hotness

  • fedora messaging

  • osbs client

As part of this investigation we want to create an exhaustive list, with an analysis of the actual PDC use case in each application.

Solutions to be explored

We have several options:

  • Upgrade PDC to a supported version of Django and take over the maintenance

  • Use a database with a simple API gateway like PostgREST or pREST in front of it

  • Create a new bespoke application that better suits our needs

  • Incorporate the functionality we need into other applications we already have (Pagure/Bodhi/Datagrepper)

Currently we are primarily proposing the last option.

Preliminary notes on maintaining current PDC

This would require significant investment: several upgrades of the underlying software and probably a Python 2.x to 3.x migration as well.

Moreover, based on the discussions within CPE:

  • current PDC is more complex than we need

  • we want to avoid maintaining another application

Because of this, we won’t focus on this avenue of exploration.

Preliminary notes on using Postgrest

Based on our discussions, we mostly use a simple CRUD API, which means that a new database model that better suits our needs could be all we need to migrate.

The biggest advantage of this approach is that it uses an off-the-shelf service, where the only things we need to maintain are the database model and the clients, plus keeping the software up to date.

We chose two candidates to investigate, based on the fact that we probably want to use PostgreSQL on the backend:
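To give a rough idea of what client code against such a gateway might look like, the sketch below builds a read URL using PostgREST's documented `column=eq.value` filter syntax and its `select` parameter. The base URL, the `component_branches` table, and its column names are hypothetical and chosen only for illustration; they do not describe an agreed schema.

```python
from urllib.parse import urlencode

# Hypothetical gateway address; not a real deployment.
BASE_URL = "https://pdc-replacement.example.org"


def build_query_url(table, filters, columns=None):
    """Build a PostgREST-style read URL: /<table>?<col>=eq.<value>&select=a,b"""
    # Each filter becomes an equality test; sorted for a stable URL.
    params = [(col, f"eq.{value}") for col, value in sorted(filters.items())]
    if columns:
        # Limit the returned columns, as a CRUD client usually would.
        params.append(("select", ",".join(columns)))
    # Keep commas readable in the select list instead of encoding them.
    return f"{BASE_URL}/{table}?{urlencode(params, safe=',')}"


# Hypothetical table and column names.
url = build_query_url(
    "component_branches",
    {"name": "f39", "global_component": "python-requests"},
    columns=["name", "active"],
)
print(url)
```

With this style of API the client only needs an HTTP library; the shape of every query follows directly from the database model.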

Postgrest as a simple API on top of a database model sounded promising, but after investigating how to integrate it with our current account system, it turned out we would need to maintain a separate service that would serve as a bridge.

  • https://postgrest.org/en/stable/auth.html

  • https://samkhawase.com/blog/postgrest/postgrest_auth0_service/

This speaks against using Postgrest.
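To make the extra maintenance burden concrete: PostgREST authenticates requests with JSON Web Tokens and, by default, reads the database role from the token's `role` claim. A bridge service would therefore have to verify the user against the account system and then issue such a token. The sketch below shows only the token-issuing half, assuming an HS256 shared secret; the role name `pdc_writer`, the user name, and the secret are hypothetical, and a real service would likely rely on a maintained JWT library rather than hand-rolled signing.

```python
import base64
import hashlib
import hmac
import json


def _b64url(data: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _sign(signing_input: str, secret: bytes) -> str:
    digest = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return _b64url(digest)


def mint_token(claims: dict, secret: bytes) -> str:
    """Return an HS256-signed JWT carrying the given claims."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":"), sort_keys=True).encode("utf-8"))
        for part in (header, claims)
    )
    return f"{signing_input}.{_sign(signing_input, secret)}"


def verify_token(token: str, secret: bytes) -> bool:
    """Check the signature only; expiry and other claims are not validated here."""
    signing_input, _, signature = token.rpartition(".")
    return hmac.compare_digest(signature, _sign(signing_input, secret))


# Hypothetical secret, user, and role; the account-system check is omitted.
SECRET = b"example-secret-of-at-least-32-characters"
token = mint_token({"role": "pdc_writer", "sub": "someuser"}, SECRET)
```

Even this small piece implies key management, token expiry, and a login flow against the account system, all of which would live in a service we would have to maintain ourselves.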

Create a new bespoke application that better suits our needs

We investigated using FastAPI and FAS, as well as the previously started fpdc proof of concept by Clement. Based on discussions with others, we decided to instead investigate ways to merge the existing functionality into the services we already maintain.