Does a top client of mine pass the continuous delivery test?

I'm taking the test I found online from:

https://sourceless.org/posts/the-continuous-delivery-test.html

There are some motivations behind this test that I'll leave to the original author, but I wanted to see how one of my main clients stacks up against these questions.

Taking The Continuous Delivery Test

1. Do you use a distributed version control system?

Yes - and I couldn't be happier. Git is the version control tool of choice; however, integration with a continuous delivery and integration platform is thin to non-existent.

I think there have been movements in the past to trigger builds on every push to a particular branch, but the builds would just sit there unless someone pushed another few sets of buttons to deploy them to an environment. By the time people were ready to do this, the developers had already pushed another dozen or so commits and created as many builds, making testing the current build pointless.

I'd say this was a pass for the most part.

2. Do you practice trunk-based development?

I tried to champion this years back, but there were some difficult CI/CD workflows to consider, as well as strong feelings in favor of long-lived feature branches that could span multiple sprints. The release manager was also uncomfortable about losing release branches.

The current state of the world today is that all developers merge code to the trunk branch when their code is considered "ready".

There is a Selenium automation suite that can be run by a developer on a local machine; however, this test suite is prone to errors, false negatives, and lengthy runtimes. The suggestion is to run the test suite on every commit, but that's hard to make a reality given the aforementioned problems. With that in mind, a developer is never sure whether they broke something or the tests themselves are broken.

This has led to a main branch that is not always deployable and sometimes doesn't build.

Given that, this feels like a fail.

3. Do you merge little and often?

Merging small chunks to main is not widespread amongst the teams. I still see several long-lived feature branches that go through many rebases and are peppered with merge commits here and there.

Changes are usually confined to a specification captured in a ticket. Tickets are sized with story points, and developers pick them up but don't update the status when the work drags on. This might be due to a lack of periodic calibration of the story point sizing.

Given that, this feels like a fail.

4. Do two people read code before it is merged?

People definitely read the code in PRs, but unfortunately not a lot of analytical thinking happens. PRs tend to get a quick scan if anything. I have seen pushback from PR authors when questions or concerns are raised, which is too bad.

Pair or mob programming is uncommon. This seems to only happen on very large projects or when there is a critical fix to go out that must be deployed post-haste.

There are some controls in the GitHub UI to require approval for merges; however, most if not all users have admin rights, so this check is bypassed most of the time.
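If the client ever wanted to close that loophole, GitHub's branch protection API can apply the same rules to admins. A hypothetical sketch in Python (the org, repo, and token are placeholders, not the client's real setup):

```python
import os
import requests

# Hypothetical owner/repo; the token needs admin rights on the repository.
OWNER, REPO, BRANCH = "example-org", "example-app", "main"

resp = requests.put(
    f"https://api.github.com/repos/{OWNER}/{REPO}/branches/{BRANCH}/protection",
    headers={
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
        "Accept": "application/vnd.github+json",
    },
    json={
        # Require two approving reviews before anything merges.
        "required_pull_request_reviews": {"required_approving_review_count": 2},
        # Apply the rules to administrators too, so the check can't be bypassed.
        "enforce_admins": True,
        "required_status_checks": None,
        "restrictions": None,
    },
)
resp.raise_for_status()
```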

Given that, this feels like a fail.

5. Do you require changes to pass checks before they can be merged?

There are no blocking CI/CD checks in place preventing merging. There have been attempts to have tests and linters run as pre-commit checks; however, these implementations were never consistent or reliable, due to things like spotty test coverage and insufficient linting rules.
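For what it's worth, even a small shared pre-commit hook would make those checks consistent across machines. A minimal sketch, assuming a Python codebase with ruff and pytest available (the tools are my assumption, not the client's actual stack):

```python
#!/usr/bin/env python3
"""Shared pre-commit hook: run the linter and fast tests before every commit.

Install by copying to .git/hooks/pre-commit and marking it executable.
ruff and pytest are assumptions; swap in whatever the project uses.
"""
import subprocess
import sys

CHECKS = [
    ["ruff", "check", "."],           # lint the whole tree
    ["pytest", "-x", "-q", "tests"],  # stop at the first failing test
]

for cmd in CHECKS:
    result = subprocess.run(cmd)
    if result.returncode != 0:
        print(f"pre-commit: '{' '.join(cmd)}' failed; commit aborted.")
        sys.exit(result.returncode)
```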

Overall, I'd say fail, but not by much. The functionality is there; it just needs to be rolled out consistently.

6. Can you test your changes in a production-like environment before you deploy them?

Short answer, no. Long answer, it depends.

On a development machine, all of the applications sit right next to each other, so everything is effectively instantaneous. The OS and application versions are the same; however, the setup doesn't match what's on production machines due to the way those machines are provisioned and configured.

On a production-like environment, OS and application versions all match what's on a development machine; however, things like load balancing, CDNs, speed, and memory aren't there, simply due to the cost and the added burden of maintaining all of that infrastructure. This is something my client has been trying to fix for many years, and after spending what I'm sure is many millions in man-hours, it is almost in a position where this is possible.

I'll call this a pass; it's definitely better than having no environment testing at all.

7. Do you deploy to production as soon as main is updated?

Sadly, the answer here was obvious after the responses to #2 through #4, as well as #6.

There's simply too much manual work that needs to happen, and the cycle time is too long between when a commit is ready and when a full build of the system representing that commit is ready to be deployed.

Given that, this feels like a fail.

8. Does your deploy process self-heal?

No, and this is a common bit of wishful thinking developers like to reach for to ease current suffering whenever the topic comes up in sprint retrospectives. No work has ever been done to bring the client any closer to making it a reality. I'm not even sure what self-healing is supposed to mean here.

Release rollbacks are never an option due to the way the application is designed. If something breaks, it's an all-hands on deck affair that results in at least one hotfix release. Sometimes post-mortems are created.

Blue-green releases, canarying (also brought up in the discussion of feature flags below), and healthchecks / outage detection are also represented more by wishful thinking than implementation. There are some healthchecks, but they're either much too low-level to provide any immediate indication to an untrained observer, or they produce too many false positives to be a reliable indicator.
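To make that last point concrete, a healthcheck worth acting on sits above the low-level signals and answers one question: can this instance serve traffic right now? A minimal sketch (the endpoint name and dependency checks are assumptions on my part):

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
import json

def check_database() -> bool:
    """Placeholder: replace with a cheap query against the real database."""
    return True

def check_downstream_api() -> bool:
    """Placeholder: replace with a ping of a critical downstream service."""
    return True

class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/healthz":
            self.send_error(404)
            return
        checks = {"database": check_database(), "downstream": check_downstream_api()}
        healthy = all(checks.values())
        body = json.dumps({"healthy": healthy, "checks": checks}).encode()
        # 200 only when every dependency is reachable; 503 otherwise, so a
        # load balancer or canary gate can act on it directly.
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8080), HealthHandler).serve_forever()
```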

Given that, this feels like a fail.

9. Does your Infrastructure as Code live alongside the service it hosts?

Yes, and no.

The application code does live in a monorepo, which is great, and lots of the ongoing work to develop the infrastructure-as-code system for ephemeral environment creation lives in a repo as well.

The only problem is that there is no correlation between the two in the form of submodules, build numbers, etc. That linkage is still managed by hand.

This feels like a pass even though the IaC is still in development and the monorepo and IaC are loosely coupled.

10. Do you use feature flags?

Yes, but not really in a way that's in the spirit of feature flags. What I have observed is that developers tend to write them in a way that makes the surrounding code dependent on the flag, permanently interleaving it with the rest of the application.

I read a great article by Martin Fowler on feature toggles, and it really gave me a new perspective about how this sort of stuff should be done.

https://martinfowler.com/articles/feature-toggles.html

Fowler went into great detail about how the code behind a feature should be easily unpluggable from the rest of the main application, similar to the way dependency injection works.
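In code terms, the difference is between sprinkling flag checks through the business logic and consulting the flag at a single seam that hands back one of two interchangeable implementations. A minimal sketch of the latter (my own toy example, not the client's code):

```python
from typing import Protocol

class PricingStrategy(Protocol):
    def price(self, base: float) -> float: ...

class LegacyPricing:
    def price(self, base: float) -> float:
        return base

class NewPricing:
    def price(self, base: float) -> float:
        return base * 0.9  # hypothetical new discount logic

def build_pricing(flags: dict[str, bool]) -> PricingStrategy:
    """The only place the flag is consulted. When the rollout is done,
    delete this function and LegacyPricing; nothing else changes."""
    return NewPricing() if flags.get("new_pricing") else LegacyPricing()

# The rest of the application depends on the interface, not the flag.
pricing = build_pricing({"new_pricing": True})
print(pricing.price(100.0))
```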

The way feature flags have been implemented with this client, they are part of the release itself, and, using Mr. Fowler's definition, almost every one is used as an Ops toggle.

Long story short, code is still released to users as soon as a new release goes out. If it breaks, the developers need to use that Ops toggle to disable it (assuming the toggle itself works correctly) and push a hotfix release to fix the issue before re-enabling the toggle.

Given that, this feels like a fail.

11. Do you include ticket IDs in your commits or branches?

I have always prided myself on this, and it's something I've done for every client I've written code for.

Other developers also tend to follow this practice for the most part; however, I do see things like "wip" or other generic commit messages that aren't easily traced back to a ticket or work item.
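This is also easy to enforce mechanically with a commit-msg hook. A sketch (the ABC-123 ticket format is a placeholder for whatever the tracker actually uses):

```python
#!/usr/bin/env python3
"""commit-msg hook: reject commits whose message lacks a ticket ID.

Install by copying to .git/hooks/commit-msg and marking it executable.
The ticket pattern (e.g. ABC-123) is a placeholder; adjust to taste.
"""
import re
import sys

TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

# Git passes the path to the commit message file as the first argument.
with open(sys.argv[1], encoding="utf-8") as f:
    message = f.read()

if not TICKET_PATTERN.search(message):
    print("commit-msg: no ticket ID (e.g. ABC-123) found; commit rejected.")
    sys.exit(1)
```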

Definitely a pass here.

12. Can you still deploy from your own machine?

I can't technically deploy code from my machine; however, as I mentioned in previous answers, the client is making a significant investment in completing a vast infrastructure-as-code project that would allow a developer to deploy a build using Terraform, with all the environment variables, overall system configuration, etc., from a single machine without having to log in to a web portal.
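When that project lands, a single-machine deploy could look roughly like the wrapper below (a sketch; the build ID, workspace, and var-file names are all made up for illustration):

```python
"""Hypothetical single-machine deploy wrapper around Terraform.

The build_id variable, workspace names, and per-environment var files
are illustrative, not the client's actual configuration.
"""
import subprocess
import sys

def deploy(build_id: str, environment: str) -> None:
    # Point Terraform at the target environment's state...
    subprocess.run(["terraform", "workspace", "select", environment], check=True)
    # ...then apply with that environment's settings and the chosen build.
    subprocess.run(
        [
            "terraform", "apply",
            "-auto-approve",
            f"-var=build_id={build_id}",
            f"-var-file={environment}.tfvars",
        ],
        check=True,
    )

if __name__ == "__main__":
    deploy(build_id=sys.argv[1], environment=sys.argv[2])
```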

Still, for now I can log in to a web portal to create a build, assign it to an environment, and deploy it from my own machine, which I believe satisfies the condition of the question.

Definitely a pass here.

13. Can you show what will happen when a branch is merged?

This is something I had never really heard of, but I'm pretty certain the answer is a "no".

There is potential for some of the IaC code to produce a diff of the system's state before and after changes; however, that doesn't really give an outside observer a good visualization of the state of the system at any given point in time.
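The closest existing primitive is probably a plan-style dry run. If the IaC is Terraform (an assumption on my part), its -detailed-exitcode flag makes a "what would this merge do" check scriptable:

```python
"""Preview what a merge would do to the infrastructure, without applying it.

Assumes the IaC is Terraform; -detailed-exitcode returns 0 for no changes,
2 when changes are pending, and 1 on error.
"""
import subprocess
import sys

result = subprocess.run(["terraform", "plan", "-detailed-exitcode", "-no-color"])

if result.returncode == 0:
    print("Merging this branch would change nothing in the deployed system.")
elif result.returncode == 2:
    print("Merging this branch would change the deployed system; review the plan above.")
else:
    sys.exit("terraform plan failed; fix the configuration first.")
```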

Given that, this feels like a fail.

Lots of room for improvement

Overall, the client performs reasonably well in this arena, though with lots of room to improve. Obviously, it's hard to meet or exceed every one of these conditions.

I think the failing answers could probably be remedied by follow-through across all teams to get things moving along instead of being stuck in an intermediate state, which unfortunately seemed to be a common theme for most of the answers.