Innovating at Scale – Practices from Within Nexxen Engineering (Part 2)

By: Chris Trader, Senior Software Engineer

January 21, 2025

At Nexxen, the stability of our platform is core to our engineering team’s mission: our customers should have a seamless experience while we continue to innovate at a fast pace. To achieve this, we rely on our ability to make small, incremental changes, push them to our production systems quickly, and immediately see the impact those changes have on the overall health of our platform. My previous post covered why and how we test in production. In this article, we’ll dive into our observability platform and our culture of ownership.

Observability-Driven Development

In a highly concurrent, low-latency system like Nexxen’s, validating a change requires us to examine the production environment holistically. This is where our observability platform, Atlas, comes into play.

Atlas is an internally white-labeled, self-hosted Grafana LGTM stack (Loki for logs, Grafana for dashboards, Tempo for traces, and Mimir for metrics) maintained by our infrastructure team. It provides us with real-time visibility into the health and performance of our production systems, enabling us to quickly detect and diagnose issues. With Atlas, every engineer has access to metrics, logs, and traces, which they can use to see how their changes are affecting the system.

At Nexxen, some of the first questions we ask when developing a new feature or making changes to our system are:

How will we know if this change is working as intended when it’s released to production?
How will we be alerted if the change is not performing as expected?

These questions are at the heart of our observability-driven development approach. By defining clear metrics upfront and ensuring that the telemetry needed to track them is in place, we can quickly assess the impact of our changes once they’re deployed to production. This proactive approach to observability helps us catch problems early, before they affect our customers.
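As a rough illustration of answering both questions before a release, the sketch below pairs a success metric (an error rate) with the threshold at which an alert should fire. It is a minimal, in-memory example and not Nexxen's actual tooling: the `ReleaseCheck` class, the metric name, and the 1% threshold are all hypothetical, and a real setup would emit these counters to the observability platform and define the alert there.

```python
from dataclasses import dataclass


@dataclass
class ReleaseCheck:
    """Hypothetical pairing of a success metric with an alert threshold."""

    name: str
    max_error_rate: float  # alert when the observed error rate exceeds this
    requests: int = 0
    errors: int = 0

    def record(self, ok: bool) -> None:
        # Count every request, and separately count the failures.
        self.requests += 1
        if not ok:
            self.errors += 1

    def error_rate(self) -> float:
        # With no traffic there is nothing to report; avoid dividing by zero.
        return self.errors / self.requests if self.requests else 0.0

    def should_alert(self) -> bool:
        # "How will we be alerted?" becomes an explicit, testable condition.
        return self.error_rate() > self.max_error_rate


# Assumed example: a new feature that must stay under a 1% error rate.
check = ReleaseCheck(name="new_feature_errors", max_error_rate=0.01)
for ok in [True] * 97 + [False] * 3:
    check.record(ok)

print(check.name, check.error_rate(), check.should_alert())
```

Writing the threshold down next to the metric means "working as intended" has a concrete definition before the change ships.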

Observability-driven development not only helps us identify and resolve issues more efficiently, but also enables us to continuously optimize our systems. By analyzing the telemetry data collected by Atlas, we can identify performance bottlenecks, resource inefficiencies, and opportunities for improvement. We proactively make optimizations and architectural changes that enhance the overall reliability and scalability of our platform.
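To make the idea of finding bottlenecks in telemetry concrete, here is a small sketch that ranks endpoints by tail latency using a nearest-rank percentile. The endpoint names and latency samples are invented for illustration, and in practice this kind of question would more likely be answered with a query against the metrics backend than with application code.

```python
import math
from collections import defaultdict


def percentile(values, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


def slowest_endpoints(samples, pct=99):
    """Group (endpoint, latency_ms) samples and rank endpoints by their pct-th percentile latency."""
    by_endpoint = defaultdict(list)
    for endpoint, latency_ms in samples:
        by_endpoint[endpoint].append(latency_ms)
    ranked = [(ep, percentile(vals, pct)) for ep, vals in by_endpoint.items()]
    # Highest tail latency first, so the likeliest bottleneck is at the top.
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical samples: (endpoint, latency in milliseconds).
samples = [
    ("/bid", 12.0),
    ("/bid", 15.0),
    ("/bid", 240.0),
    ("/report", 80.0),
    ("/report", 95.0),
]

for endpoint, p99 in slowest_endpoints(samples):
    print(endpoint, p99)
```

Ranking by a high percentile rather than the mean surfaces endpoints whose typical latency looks healthy but whose slowest requests do not.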

A Culture of Ownership

Perhaps most importantly, Nexxen has a culture of ownership where every engineer is given the knowledge, tools, responsibility, and trust they need to own their work end-to-end. We all know how our systems work, and nothing is “thrown over the wall” for another team to run or monitor in production.

To support this mindset, we have invested heavily in production-related tooling and practices. Engineers are encouraged to actively engage with production systems daily, as that is where our users interact with our code and infrastructure. We have built robust guardrails and safety nets that enable us to confidently make changes. By fostering a culture of trust, ownership, and continuous improvement, we are able to deliver exceptional value to our customers while maintaining a stable and reliable platform.

Conclusion

At Nexxen, we pride ourselves on our platform’s stability and our ability to keep improving our technology as we grow. By testing realistically in production, tracking success metrics and analyzing our performance data, and fostering a culture of ownership throughout our engineering teams, we deliver both innovation and stability.
