Site Reliability Engineer – Container Platform

Criteo - 75009 Paris 9e


Who we are:
At Criteo, our culture is as unique as it is diverse. With offices around the world, our incredible team of 2,700+ Criteos collaborates to create an open & inclusive environment. We work together to achieve our goals, push boundaries, and be impactful. All of this supports us in our mission to power the world’s marketers with trusted & impactful advertising.
About SRE:
Most of all, we are creators. From designing ground-breaking products to finding unique ways to solve technical challenges at an exceptional scale, our tech teams work with state-of-the-art methodologies to shape the future of advertising. The Site Reliability Engineering teams keep one of the largest computing platforms in the AdTech world functioning like clockwork. They keep our products running using a broad selection of technologies, such as large-scale data compute & storage services (Hadoop, SQL & NoSQL), streaming (Kafka), platform as a service (Chef, Mesos), identity management (Kerberos) and analytics (Hive, Druid, Vertica), as well as an extensive monitoring/observability infrastructure.
The Container-Platform team builds and operates the current and next-generation platform that runs product services on our worldwide Criteo infrastructure. We provide platform as a service for all stateless and stateful applications at Criteo (think web servers, databases, distributed file systems). We use Apache Mesos & Kubernetes to achieve that vision. We take the time to understand our clients' needs and help them launch thousands of instances of their apps across the world, isolated using containers and connected through our internal service mesh network.
You will be in charge of building and operating our 10k-server clusters across 8 datacenters around the world. You will have to imagine and implement mechanisms that let users forget about infrastructure and focus on building their products.

Your day-to-day tasks:

  • improve isolation between containers to allow colocation of intensive tasks on the same servers (solving “the noisy neighbor problem”)
  • fight against toil and time-consuming tasks to automate ourselves out of a job
  • interview our users to understand what they really need and how we can help them focus on their own mission
  • design the automation of server maintenance with minimal disruption to user services

You will be part of a strategic team within the R&D organization and advise our 600 internal clients. You will also help us to build a platform that abstracts away Kubernetes and Mesos for our users.

You’ll also help us to maintain our open source projects/forks (

What we expect:
Obviously, you know your tech, you are willing to learn new technologies, and you find solutions for “impossible” problems. You are also a good communicator. You will be talking with our users to find out what they need, and you will be writing documentation to explain how things work, both for ourselves and for our internal users. The operating language at Criteo is English.
Our stack:
We are not looking for experts in a specific technology, of course, but here is what we currently use the most:

  • Apache Mesos, Marathon and Kubernetes
  • A bit of Python, Ruby, Scala and C++
  • Chef for server provisioning

At Criteo, we are committed to creating an environment where all Criteos feel a sense of belonging. We nourish our diversity by listening to all cultures within Criteo – and there are many. We are proud to be a global team and conscious that it takes people with different perspectives, thoughts and cultures to succeed.
Criteo collects your personal data for the purposes of managing Criteo’s recruitment related activities. Consequently, Criteo may use your personal data in relation to the evaluation and selection of applicants. Your information will be accessible to the different Criteo entities across the world. By clicking the “Apply” button you expressly give your consent.