Challenges
Although the Peppol Connect team is not the largest at Money Forward in terms of headcount, its repository ranks among the top five in CircleCI credits consumed. This disproportionate use of CI compute resources by such a small team presented an intriguing challenge: the team's commit frequency clearly did not justify the high resource consumption. Consequently, I was assigned the responsibility of investigating this anomaly, determining its root cause, and proposing potential solutions.
First discovery
Because Peppol Connect is written in Rust, it required a large machine executor type to compile and link all components into a single deployable artifact. To keep compilation times down, we maintained a compilation cache of the build artifacts. A crucial observation during the investigation was that a significant portion of each CI job's time was spent downloading this cache: over time, the cache had grown to the point where retrieving it took longer than building without it. Consequently, the initial course of action was to reduce the cache size to improve overall efficiency.

The cache-restore step took most of the workflow execution time, as shown above.
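For context, each CI job restored and saved the compilation cache with steps roughly like the following sketch; the cache keys and paths are illustrative assumptions, not our exact configuration:

```yaml
# Illustrative per-job caching steps; keys and paths are assumptions,
# not our exact configuration.
steps:
  - checkout
  - restore_cache:
      keys:
        # Exact match on the lockfile first, then fall back to the latest cache.
        - cargo-cache-{{ checksum "Cargo.lock" }}
        - cargo-cache-
  - run: cargo build --all-targets
  - save_cache:
      key: cargo-cache-{{ checksum "Cargo.lock" }}
      paths:
        - ~/.cargo/registry
        - target
```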
First attempt
After doing some benchmarks, we got the following results:
| | Cache size | Download time | Compilation time | Total workflow duration |
|---|---|---|---|---|
| Unoptimized | 16 GB | roughly 10-12 minutes | 50 seconds | 16 minutes |
During our investigation, we observed that incremental builds were enabled for certain jobs. Incremental compilation is a feature of rustc that can decrease the time required to rebuild a project by storing additional information alongside the compiled artifacts. In our case, however, this feature was not beneficial: it increased disk usage, which in turn contributed to a larger cache, and the drawbacks ultimately outweighed the potential benefits. We therefore disabled incremental compilation at the workflow level:
```yaml
workflow:
  environment:
    CARGO_INCREMENTAL: 0
```
Upon disabling the incremental build feature, we observed the following results:
| | Cache size | Download time | Compilation time | Total workflow duration |
|---|---|---|---|---|
| Unoptimized | 16 GB | roughly 10-12 minutes | 50 seconds | 16 minutes |
| Incremental build disabled | 12 GB | roughly 8 minutes | 50 seconds | 12 minutes |
A better attempt
To further minimize the cache size, we considered the approach of caching only the project dependencies, similar to the common practice in Node.js projects. However, unlike in Node.js, there is no built-in mechanism in Cargo, Rust’s package manager, to exclusively build and cache project dependencies. We discovered ongoing discussions regarding this issue within the Cargo project, most notably this one, but no significant progress has been made yet. As an alternative, we decided to use a crate called cargo-chef, specifically designed for caching dependencies. Implementing this solution required modifications to our pipeline:
- Previously, every job restored the cache in its own step, so the total cache-download time grew proportionally with the number of jobs. To address this, we created a separate job dedicated to creating or downloading the cache using cargo-chef, and shared the result with the other jobs via the workspace (a sketch of this job follows the list).
- Different job types utilized different caches. For instance, end-to-end (e2e) tests required the ability to spin up multiple containers to perform tests, necessitating the use of a Virtual Machine. Other jobs were run inside Docker containers. To streamline cache usage, we modified the e2e tests to employ a Docker-in-Docker approach, allowing all jobs to share a single cache set.
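A minimal sketch of the dedicated cache job, assuming a Docker executor; the image tag, job names, and paths are illustrative assumptions rather than our exact configuration:

```yaml
# Sketch of the dedicated dependency-cache job; image tags, job names, and
# paths are illustrative assumptions.
jobs:
  build_dependency_cache:
    docker:
      - image: cimg/rust:1.75
    steps:
      - checkout
      - run:
          name: Install cargo-chef
          command: cargo install cargo-chef --locked
      - run:
          name: Build dependencies only
          command: |
            # Describe the dependency graph, then compile only the dependencies
            # so that target/ contains no project code.
            cargo chef prepare --recipe-path recipe.json
            cargo chef cook --recipe-path recipe.json
      - persist_to_workspace:
          root: .
          paths:
            - target
            - recipe.json

  test:
    docker:
      - image: cimg/rust:1.75
    steps:
      - checkout
      # Reuse the pre-built dependencies produced by build_dependency_cache.
      - attach_workspace:
          at: .
      - run: cargo test
```

In the workflow definition, the downstream jobs would declare `requires: [build_dependency_cache]` so that the workspace exists before they attach it.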
With all the pieces in place, this is what we got:
| | Cache size | Download time | Compilation time | Total workflow duration |
|---|---|---|---|---|
| Unoptimized | 16 GB | roughly 10-12 minutes | 50 seconds | 16 minutes |
| Incremental build disabled | 12 GB | roughly 8 minutes | 50 seconds | 12 minutes |
| Dependencies cached | 0.5 GB | 10 seconds | around 2 minutes | 8 minutes |
Problems
A concern arose when my mentor, @TakO8ki, highlighted that cargo-chef is not well-maintained. At the time of the decision, the project repository’s most recent commit was approximately five months old, making it unsuitable for a production environment. Additionally, employing a Docker-in-Docker approach for running e2e tests was not considered ideal, as it introduced complexity to the local development setup. Considering these factors, we decided to explore alternative solutions that would be more reliable, efficient, and better aligned with our development requirements.
A simpler attempt
Drawing inspiration from rust-cache, we shifted our focus from building a cache for dependencies only to removing heavy objects from the cache before saving it. Upon investigating the cache content, we discovered that the artifacts produced by kafka-dependent crates were excessively large. To address this issue, we decided to delete all artifacts that depended on kafka before saving the cache. This approach resulted in a significant reduction in cache size, ultimately improving efficiency and performance.
The following snippet shows the reusable CircleCI command we use to remove the kafka-dependent artifacts before saving the cache:
```yaml
clean_rust_target:
  parameters:
    path:
      type: string
  steps:
    - run:
        name: Clean target
        command: |
          # List the direct dependencies of the kafka crate (placeholder
          # "<crate-name>") and delete their build output from the target
          # directory before the cache is saved.
          for package in $(cargo metadata --manifest-path << parameters.path >>/Cargo.toml --all-features --format-version 1 | jq -r '.packages[] | select(.name == "<crate-name>") | .dependencies | map(.name) | .[]'); do
            rm -rf << parameters.path >>/target/debug/build/$package-*
          done
```
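In a job, this command runs right before the cache is saved; a hypothetical invocation (the `path` value and cache key are illustrative) could look like:

```yaml
# Hypothetical usage inside a job's steps; the path and cache key are
# illustrative assumptions.
- clean_rust_target:
    path: .
- save_cache:
    key: cargo-cache-{{ checksum "Cargo.lock" }}
    paths:
      - target
```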
| | Cache size | Download time | Compilation time | Total workflow duration |
|---|---|---|---|---|
| Unoptimized | 16 GB | roughly 10-12 minutes | 50 seconds | 16 minutes |
| Incremental build disabled | 12 GB | roughly 8 minutes | 50 seconds | 12 minutes |
| Dependencies cached | 0.5 GB | 10 seconds | around 2 minutes | 8 minutes |
| Kafka removed before save | 1.2 GB | 30 seconds | around 1 minute | 5 minutes |
Evaluation
After a month of implementing the optimization, we conducted a benchmark assessment to evaluate its impact on CircleCI credit usage. The following results were obtained:
| Period | Jan 29 – Feb 28 (after optimization) | Dec 29, 2023 – Jan 28 (before) |
|---|---|---|
| Storage-months | 31.2 GB | 109.2 GB |
| Compute minutes | 11,177 | 20,146 |
| Compute credits | 437,714 | 965,185 |
| DLC credits | 206,400 | 373,200 |
| DLC + compute credits | 644,114 | 1,338,385 |
The data presented in the table above clearly demonstrates that the optimization achieved a significant reduction in CircleCI credit utilization, cutting the monthly usage by approximately half.
Conclusion
Occasionally, the most effective solution does not have to be the most intricate one. In this particular case, simply removing the kafka-dependent artifacts from the cache produced a significant reduction in credit usage, eliminating the need for more complex setups such as cache sharing or Docker-in-Docker. While those advanced configurations might prove useful in the future, the current situation demonstrates that a simple, straightforward approach can yield substantial benefits. When optimizing, it is essential to treat developer hours as a valuable resource: an effective optimization strategy should also prioritize simplicity, so that it makes the best use of the development team's time and effort.