CPE Q3 Achievements 2020
Saturday, 17 October 2020, Aoife Moloney, Infra

Hi there,

I'm Aoife Moloney. You may remember me from such communications as the CPE office hours, Data Centre - what it means for you, and The Future of Communishift.

Over the last three months, the Community Platform Engineering team (or CPE for short as it's long to keep typing) have been working on a few projects, and generally surviving 2020 like everyone else. But we made it, and so did our projects! Mostly… 🙂

 

Over the last three months we worked on:

  • The Great Fedora Data Centre Move of 2020
  • Noggin
  • CentOS Stream
  • Packager Workflow Healthcare (Always check with your maintainer before taking this workflow. Side effects may include, but are not limited to, frustration, tears, and elation at successful builds)
  • Fedora-Messaging Schemas

 

We also had our long-standing (and long-suffering) ‘sustaining team’ on the front lines, who maintain and run both the Fedora and CentOS infrastructures daily and respond to issues, bugs, etc. And they do a damn fine job too.

 

And we attended and participated in a few conferences too, namely Nest with Fedora & DevConf US.

 

So, what did we as a team overall achieve in these last few months?

 

CPE Infra & Releng Team 

This team was led by Pingou, and its members in Q3 were Mark O’Brien, Michal Konecky, Fabian Arrotin, David Kirwan, Kevin Fenzi, Vipul Siddharth, Stephen John Smoogen & Tomas Hrcka.

This team is a sub-team of CPE and focuses on ‘lights-on’ work in both the Fedora and CentOS infrastructures. We will always have some of our team members working in this way each quarter, as it is good to have a break from scheduled project workloads and take a foray into the (sometimes) chaotic world of infrastructure maintenance, aka FIRE!!! 🙂

What they did: 

  • Changed their name. May we present: CPE Infra & Releng Team - oooooh, aaaaah!
  • Vipul & David worked with Fabian in the CentOS infra, did something with OpenShift clusters & migrated the kojihub for https://cbs.centos.org to new infrastructure
  • Kevin and Smooge moved all of the Fedora infrastructure. 117 servers.  Let that number sink in.
  • Pingou & Michal did a ton of babysitting toddlers 🙂 They moved a lot of scripts over and things are working well
  • Tomas helped bootstrap F33 - oh yeah!
  • And Mark had (not biologically,  but in sentiment) a baby! And became an admin for some of the Fedora Infrastructure.

Why it's good:

  • The name change represents what this team works on and is easy to understand instantly. Plus, naming is hard so we wanted to keep it simple 🙂
  • Helped release F33 beta
  • We doubled down on toddlers, allowing us to build more automation around the infrastructure
  • Assisted with the Fedora data centre move, keeping disruption to Fedorans' day-to-day lives to a minimum
  • Helped keep CentOS CI operational - and then helped put out the flames when it caught fire 🙂
  • Over 500 tickets across both Fedora & CentOS infra + releng resolved by this team - that is some seriously good firefighting!

 

Fedora Data Centre Move

This dynamic duo was Kevin Fenzi and Stephen Smoogen, with supporting cast members from both CPE and the community along the way. The goal of this project was to successfully move a (large) amount of the Fedora infrastructure hardware from one data centre to another without too much chaos. And considering the worldwide pandemic that happened right at the beginning, they did a pretty fine job succeeding. Some additional services are still being added to the infrastructure in its new home in IAD, so if you notice a few still missing, we are getting to them slowly but surely. Thank you again for your patience and understanding during these last few months!

What they did: 

  • Moved a ton of servers across the United States
  • Kept critical services in Fedora Infrastructure alive during the move
  • Worked an uncountable number of hours!

Why it's good:

  • We got some new hardware!
  • The team carried out some resilience testing in the new data centre which means more reliability for the infrastructure should bad things happen 
  • Updated records and warranties were a passive benefit of this move too

 

Noggin

This team was led by Aurelien Bompard, and its members in Q3 were Ryan Lerch, Nils Philippsen & James Richardson. The goal of this project is to replace the current Fedora Account System (FAS) with a newer one (Noggin) and migrate the CentOS accounts to that one instance, which will mean our team has one authentication system to maintain for two infrastructures long term. This team has been working to a November 2020 deadline, but unfortunately during Q3 they faced a number of challenges. The staging environment they needed to test in was delayed by the data centre move, and once they got it, they realized that the plugin they had spent time developing was not going to work long term, so they now have to redo a bit of work in Q4. There were also a lot of holidays and personal events for the team in Q3, because everyone is human and entitled to a life 🙂 They have re-scoped their work for Q4 to make sure what's delivered is sustainable and reliable long term, more people have joined the team, including some sysadmins for support along the way, and they are now looking at delivering Noggin in full by the end of January 2021.

What they did: 

  • A lot of Ipsilon investigation
  • Added a spam curtailer service to Noggin
  • Added an agreements section for users to select their user preferences 
  • Deployed Noggin to staging but found out the way they did it won't be good for the project long term
  • Had a little cry about developing a plugin unnecessarily, hugged it out and then re-planned dates and the work we need to do in Quarter 4 to be able to deliver a better, more reliable and robust service in January 2021. Cue Noggin - Rise of the Phoenix Project

Why it's good:

  • We knew where we went wrong, learned a lot both technically and as a project team for it, and were able to call the mistakes out and get the support we need to get the project back on track. Just a little bit later than we wanted.
  • We still created a solution that will meet the needs of both the CentOS and Fedora community users, and once we have the correct configurations in place and it is ready to be tested, we look forward to your feedback!

CentOS Stream:

This team was led by Brian Stinson, and its members in Q3 were Johnny Hughes, Carl George, Mohan Boddu, Leonardo Rosetti, James Antill & Siteshwar Vashisht.

What they did: 

  • A lot of darn package & module building
  • Made light-hearted threats to teach their PO how to convert a CentOS Linux install to CentOS Stream using the new release package - which she then did! 🙂
  • Kept CentOS Stream compose up to date with RHEL nightlies
  • Launched the centos-stream-release package - Big deal. Like, huge.

Why it's good:

  • CentOS Stream is continuing to stand on its own and becoming a more robust distro
  • There's lots more content in Stream for its users
  • Users can now swap from CentOS Linux to Stream easily

Packager Workflow Healthcare:

This team was led by Will Woods, and its team members were Adam Saleh and Stephen Coady, with Pingou in a part-time consulting/reviewing role. The team took a look into the Fedora packager workflow and tried to identify weaker points in the chain and spot the times that are more prone to downtime. They are finalizing a report of their findings to send to the community lists, hopefully with a ‘next steps’ section that they feel will help reduce the issues packagers sometimes face in Fedora. It's a work in progress, but having some data to read and understand is a great launching point.

What they did:

  • Refined the monitor-gating script that monitors the packager pipeline to enhance its performance
  • Picked a certain date range, got a database dump for it, and pulled metrics from the dump into Grafana to chart uptimes of applications within the pipeline
  • Created a diagram of the pipeline to help understand how packages flow through the Fedora infra

Why it's good:

  • The diagram of the packager workflow process is a great resource for both packagers and new contributors to the Fedora community to refer to when trying to understand how things work
  • The team also have some recommendations they are working through with management and the wider CPE team to identify possible next steps and how we can improve the packager experience long term by adopting better monitoring.

 

Fedora-Messaging Schemas:

This project was also being worked on by the Noggin team part-time, so Aurelien Bompard, Nils Philippsen & Ryan Lerch. We needed to pause this work around the start of September and we hope to be able to return to it over the next quarter - October, November & December.

The guys have a GitHub board (linked in the list below) with a cookie-cutter schema available and a list of apps they were working on, so if you want to help out on this one, please feel free to visit the board and grab a card! 🙂

What they did: 

  • Created a board to track the work being done and what's left to do: https://github.com/orgs/fedora-infra/projects/7
  • Created a template schema 
  • Created a list of applications that require a schema update
  • Added some schemas to applications that need them

Why it's good:

  • This will help us progress the retirement of fedmsg in 2021
  • It will also give applications, and application maintainers, access to new Fedora Messaging schemas for faster & more reliable notifications.

 

And that, my dear friends, is Quarter 3 for CPE.

Take care all, and see you around IRC! 🙂

 

Aoife
