🐛 Poetry v1.3 broke my pipeline

Did poetry install abruptly exit with error code 1 in Buildkite, yet you cannot reproduce it on your local machine? It took my team around half a day to track down the root cause, and I hope this post can save you some time.

For context, we use Docker to build our images for the production environment, and all tests are executed inside the Docker container in Buildkite. This gives us confidence that if we have an issue in our CI/CD pipeline, we can reproduce it locally.

Apparently, we took this assumption too far: the difference in base OS (Linux for Buildkite, macOS for local) plus other hardware disparities can easily throw us off. This time, it was the TTY.

When we run a test locally, we tend to use an interactive shell, whereas in the pipeline this is normally discouraged for the sake of performance and cost. An interactive shell is also pointless there, since developers rarely connect to a build machine to give the build extra input. However, this difference can mask issues that would otherwise be exposed easily and earlier.
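
To make the TTY difference concrete, here is a minimal Python sketch of one way to mimic a non-interactive pipeline locally. It assumes the only relevant difference is that the standard streams are not attached to a terminal. The helper name run_without_tty is hypothetical and is not part of Poetry or Buildkite.

```python
import subprocess
import sys


def run_without_tty(cmd):
    """Run cmd with its standard streams detached from any terminal.

    stdin comes from /dev/null and stdout/stderr go to a pipe, so the
    child process sees no TTY, similar to a non-interactive CI step.
    """
    return subprocess.run(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,  # report the exit code instead of raising
    )


if __name__ == "__main__":
    # The child reports whether its own stdout is a terminal.
    probe = [sys.executable, "-c", "import sys; print(sys.stdout.isatty())"]
    result = run_without_tty(probe)
    print("child sees a TTY:", result.stdout.strip())
```

Swapping the probe for ["poetry", "install"] inside the same container image is one way to check whether a failure depends on the missing TTY before pushing to the pipeline.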

We initially observed inconsistent build outcomes, caused by an unpinned Poetry version. Everything was fine through Poetry v1.2.2. But once the cached Docker layer expired and the latest Poetry kicked in, poetry install broke. The failure is so abrupt that even if you turn on --verbose, you get no extra insight from the stack trace.

This issue has been recorded here. Essentially, a stdlib method used in cleo behaves differently in the newer version, and the change sneaked into Poetry v1.3 without being caught by tests. Who would come up with a test case like that?

Anyway, if you hit the same issue and want to avoid it in your CI/CD pipeline, please do one of the following:

  • use poetry --quiet or poetry --no-ansi if you still want to stay on this Poetry version
  • pin Poetry to v1.2.2
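
The two workarounds above can be sketched as small Python helpers that a build script might use to assemble its commands. This is only an illustration under assumptions: the function names are hypothetical, Poetry is assumed to be installed with pip, and --no-ansi is added only when stdout is not a terminal.

```python
import sys

# Version taken from the recommendation above.
PINNED_POETRY = "1.2.2"


def pin_poetry_cmd(version=PINNED_POETRY):
    """Build a pip command that installs one exact Poetry version."""
    return [sys.executable, "-m", "pip", "install", f"poetry=={version}"]


def poetry_install_cmd(is_tty=None):
    """Build a poetry install command, adding --no-ansi without a TTY."""
    if is_tty is None:
        # Detect whether the current stdout is attached to a terminal.
        is_tty = sys.stdout.isatty()
    cmd = ["poetry", "install"]
    if not is_tty:
        cmd.append("--no-ansi")
    return cmd


if __name__ == "__main__":
    print(" ".join(pin_poetry_cmd()))
    print(" ".join(poetry_install_cmd(is_tty=False)))
```

Pinning the version also removes the dependence on when the cached Docker layer happens to expire, which is what made the build outcome inconsistent in the first place.
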
This post is licensed under CC BY 4.0 by the author.