Systems Reliability Engineer - Americas
Posted Sep 27
Headquarters: London, UK
At Canonical it is our mission to make open source software available to people everywhere. We believe the best way to fuel innovation is to give the innovators the technology they need. As a Systems Reliability Engineer (SRE) for the Information Services (IS) team you'll play a key role in driving this mission and helping to define the future of free software.
SREs work closely with development teams to build and maintain the extraordinary infrastructure required to run all of Canonical and Ubuntu’s systems and services. The scope of our responsibility combined with the overall size of our environment means that our SREs face new challenges every day. From developing automated processes for faster, more reliable deployments to building large and scalable cloud environments, every day at Canonical is an opportunity to learn something new and collaborate with some of the most talented technical minds in the industry.
IS supports and maintains all of Canonical’s production services and IS team members use real-life operational experiences to contribute to product improvements. As an SRE you’ll be in a unique position that will allow you to provide critical feedback to developers by writing code, submitting bugs, and working with others within the company to ensure that Canonical products are as good as they can be. You will also be able to develop and submit fixes and enhancements directly.
Key Responsibilities & Accountabilities
SREs rotate through three roles:
- Maintaining all core services, networks, and infrastructure (including public and private clouds). The ability to work under pressure and demonstrate sound problem-solving skills in a fast-paced and complex environment is key here.
- Working directly with a variety of development teams within Canonical in a DevOps role to test, deploy, monitor and maintain services running on our production clouds. This requires an overlap of development and administration skills, because you will help write and review the code that you then use to deploy and maintain services using Canonical's cloud products.
- Larger project work, currently focused on large scale cloud deployments and overall process improvements. This role gives SREs the ability to utilize development and architecting skills in a focused manner that is unique to Canonical.
Required Skills & Experience
- You have prior experience working in a large, highly available environment
- You are willing to be flexible and adaptable with the ability to learn new things quickly.
- You have strong development skills and experience writing code in languages such as Python, Go or Ruby.
- You are heavily focused on automation, preferably with experience in building and maintaining self-service tools.
- You have an authoritative understanding of, and experience with, the administration of infrastructure services such as DNS, DHCP, SSH, Apache/Nginx, HAProxy, Squid/Varnish, PostgreSQL/MySQL etc.
- You have practical knowledge of IP networking and routing
- You have a strong security focus, including knowledge of network, operating system and application-level practices
- You have familiarity with software development and code review practices, including use of DVCS (e.g. git or bzr)
- You have experience deploying, administering and maintaining services in a cloud computing environment
- You are able to communicate clearly in English, especially using email and IRC
- You have a college degree in a relevant technical field or equivalent experience.
- You are self-driven and able to troubleshoot, find answers and ask others when appropriate
- You are motivated, organised, and willing and able to work well remotely within a distributed team
- You are able to participate in our weekend on-call rotation, approximately one weekend every 12 weeks
Desired Skills & Experience
- You have prior experience administering OpenStack
- You have familiarity with Juju and MAAS
- You have familiarity with Ubuntu or Debian
- You have prior experience with configuration management tools (Puppet, Chef, CFEngine, etc.)
- You have prior experience maintaining and configuring routers and firewalls (Cisco, iptables)
Canonical is an equal opportunity employer.