This is a summary of the work done on initiatives by the Community Platform Engineering (CPE) Team at Red Hat. Each quarter, the CPE Team, together with CentOS and Fedora community representatives, chooses the initiatives to work on in that quarter. The CPE Team is then split into smaller sub-teams that work on the chosen initiatives, plus the day-to-day work that needs to be done.
The following is the list of sub-teams this quarter:
The purpose of this team is to take care of day-to-day business regarding CentOS and Fedora Infrastructure and Fedora release engineering work. It’s responsible for services running in Fedora and CentOS infrastructure and preparing things for the new Fedora release (mirrors, mass branching, new namespaces etc.). This sub-team is also investigating possible initiatives. This is done by the Advance Reconnaissance Crew (ARC), which is formed from the Infra & Releng sub-team members based on the initiative that is being investigated.
In addition to the normal maintenance tasks (reboots, updates for security issues, creating groups/lists, fixing application issues) we worked on a number of items:
While taking care of day-to-day business like nightly composes, package retirements and unretirements, new SCM requests and occasional Koji issues, we worked on the new Fedora release.
We investigated upgrading the front-end web UI for the CentOS mailing list. The investigation concluded that Mailman3, Postorius and Hyperkitty would need to be packaged for EPEL8, and that a new server would need to be deployed with the current CentOS mailing list migrated to it.
This initiative is working on CentOS Stream/Emerging RHEL to make this new distribution a reality. The goal of this initiative is to prepare the ecosystem for the new CentOS Stream.
One thing we tackled was enabling side tag builds for Fedora ELN. Initially, we wanted to implement proper side tags for ELN, but we eventually settled on a simpler approach where we tag the Rawhide builds in and then rebuild them in ELN. This ensures that we get all the packages built in ELN, with the Rawhide build as a backup should the ELN build fail. We can even use this as a health metric for ELN: how many of the packages in ELN are actually ELN builds.
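The health metric mentioned above can be sketched as a simple ratio. This is a minimal illustration, not the team's tooling: the `build_origins` mapping and its `"eln"` / `"rawhide"` labels are hypothetical inputs invented for the example.

```python
def eln_native_ratio(build_origins):
    """Return the fraction of packages in ELN that are real ELN builds.

    `build_origins` maps a package name to "eln" or "rawhide".
    Both the mapping and the labels are hypothetical.
    """
    if not build_origins:
        return 0.0
    native = sum(1 for origin in build_origins.values() if origin == "eln")
    return native / len(build_origins)


# Hypothetical sample: three packages rebuilt in ELN, one Rawhide fallback.
sample = {"bash": "eln", "glibc": "eln", "foo": "rawhide", "bar": "eln"}
print(f"{eln_native_ratio(sample):.0%} of packages are ELN builds")
```

A ratio close to 1.0 would mean that almost everything was rebuilt successfully in ELN and few Rawhide fallbacks are in use.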
For CentOS Stream 9, cloud images are now available in AWS. You can find them by searching for "centos stream 9" in AWS. To make sure you get the latest image, add the current month in YYYYMM form (for example, "202110" for October 2021).
Also, CentOS Stream 9 repositories are now available through mirrors using a metalink. Existing systems get this set up automatically with an update, as the centos-release package will include this metalink. This will take some load off the CentOS infrastructure and may even make your updates faster.
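As a small illustration of the search term described above, the following sketch builds the string for a given date. It assumes the month is simply appended to the search text; the function name is hypothetical.

```python
from datetime import date


def centos_stream_9_search_term(when):
    """Build the search text with the month in YYYYMM form appended."""
    return f"centos stream 9 {when:%Y%m}"


print(centos_stream_9_search_term(date(2021, 10, 15)))
# prints: centos stream 9 202110
```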
The goal of this initiative is to update and enhance the Datanommer and Datagrepper apps. Datanommer is the database that stores all of the Fedora messages sent in the Fedora Infrastructure. Datagrepper is an API with a web GUI that allows users to find messages stored in the Datanommer database. The current solution is slow, and the database structure is not well suited to the current volume of data. This is where this initiative comes in.
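To give an idea of what querying Datagrepper looks like, here is a minimal sketch that only builds a query URL and does not send a request. The endpoint and the parameter names (`topic`, `delta`, `rows_per_page`) are assumptions based on the public Datagrepper instance and should be checked against its own documentation; the topic string is only an example.

```python
from urllib.parse import urlencode

# Assumed public endpoint; verify against the deployed instance.
DATAGREPPER_RAW = "https://apps.fedoraproject.org/datagrepper/raw"


def build_query_url(topic, delta_seconds, rows_per_page=20):
    """Build a URL asking for recent messages on one topic.

    `delta_seconds` limits results to messages from the last N seconds.
    Parameter names are assumptions, see the note above.
    """
    params = {
        "topic": topic,
        "delta": delta_seconds,
        "rows_per_page": rows_per_page,
    }
    return f"{DATAGREPPER_RAW}?{urlencode(params)}"


# Example topic string, used here only for illustration.
print(build_query_url("org.fedoraproject.prod.buildsys.build.state.change", 3600))
```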
Datanommer and Datagrepper have been upgraded to use TimescaleDB, an open-source relational database for time-series data. TimescaleDB is a PostgreSQL extension that takes care of sharding the large amount of data that we have (and keep generating!), and maintains an SQL-compatible interface for applications.
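To illustrate the general idea of splitting time-series data by time, the toy sketch below groups messages into fixed-width time chunks. It is a conceptual illustration only, not how TimescaleDB or Datanommer is implemented; the seven-day interval and all names are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone

CHUNK = timedelta(days=7)  # hypothetical chunk width
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def chunk_start(ts):
    """Return the start of the fixed-width time chunk that contains ts."""
    return EPOCH + ((ts - EPOCH) // CHUNK) * CHUNK


def partition(messages):
    """Group (timestamp, body) pairs by the chunk they fall into."""
    chunks = defaultdict(list)
    for ts, body in messages:
        chunks[chunk_start(ts)].append(body)
    return dict(chunks)


first = datetime(2021, 10, 1, 12, 0, tzinfo=timezone.utc)
parts = partition([
    (first, "msg-a"),
    (first + timedelta(hours=1), "msg-b"),
    (first + timedelta(days=30), "msg-c"),
])
print(len(parts))  # two chunks: one with msg-a and msg-b, one with msg-c
```

A query restricted to a time range then only needs to look at the chunks that overlap that range, which is the intuition behind keeping large message tables manageable.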
Datagrepper and the Datanommer consumer are now running in OpenShift instead of dedicated VMs.
DNF Counting is used to obtain data on how Fedora is consumed by users. The current implementation experiences timeouts and crashes when the data are obtained. This initiative is trying to make the retrieval of counting data more reliable and efficient.
The scripts that create the statistics for https://data-analysis.fedoraproject.org/ were cleaned up and refactored, making them stable enough that they no longer require manual intervention.
The code at https://pagure.io/mirrors-countme/ now has tests running in CI and is packaged as an RPM to avoid further mishaps during package installation. The deployment scripts were cleaned up as well, as was the actual deployment on the log01 machine, where hard-to-track manual interventions for last-minute bug fixes were replaced by Ansible scripts.
The cron jobs that run the batch jobs now send notification emails only on failure. To check the overall health of the batch process, see the simple dashboard at https://monitor-dashboard-web-monitor-dashboard.app.os.fedoraproject.org/
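One common way to get failure-only emails from cron is to wrap the job so that it produces output only when it fails, since cron typically mails whatever a job prints. The sketch below shows that pattern; it is an assumption about the general technique, not the actual scripts used on log01, and `run_quietly` is a hypothetical name.

```python
import subprocess
import sys


def run_quietly(cmd):
    """Run cmd and stay silent on success.

    On failure, forward the captured output to stderr so that cron
    has something to mail. Returns the command's exit code.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        sys.stderr.write(result.stdout + result.stderr)
    return result.returncode


# Example: a command that succeeds produces no output at all.
exit_code = run_quietly([sys.executable, "-c", "print('batch job done')"])
```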
The goal of this initiative is to deploy OpenShift 4 in Fedora Infrastructure and start using Prometheus as a monitoring tool for apps deployed in OpenShift. This initiative should also define which metrics will be collected.
If you made it this far, thank you for reading. If you want to contact us, feel free to do so in the #redhat-cpe channel on libera.chat.