Site Reliability Engineer - Databases (Remote)
London, England, United Kingdom
2d ago

Do you want to be a Site Reliability Engineer who builds and manages scalable, self-healing, globally distributed systems?

Our Site Reliability Engineers make sure users are always connected to great local businesses by keeping Yelp fast and available as we continue to scale.

No matter how many times we get searched, scraped, scanned, spammed, pinged, paged, or queried, we gotta keep our cool and keep the site and the apps running smoothly.

We work for both the Yelp end users and the Yelp developers, implementing critical parts of the core architecture and supporting developers as they do the same.

We get to take on exciting challenges that you can only find at the kind of scale that serves over 100 million users per month.

Spinning up infrastructure should always be a git commit and a code review away: automation and self-service are at the core of what we do.

We're looking for people with a passion for all things related to distributed systems, serving queries fast, uptime, scaling, and solving hard problems with the right tools.

We have fun working on these challenges and are looking for others who do, too!

Where You Come In:

  • Work closely with developers in supporting new features and services
  • Analyze solutions and implement best practices for our database cluster and its components
  • Build cluster management tooling for the Cassandra Kubernetes Operator
  • Develop and maintain easy, intuitive API (REST / GraphQL) interfaces to our databases that keep developers moving fast
  • Work on observability of relevant database metrics and troubleshoot site issues using industry-leading tools like Splunk and Prometheus
  • Support and administer Cassandra clusters, as well as the stacks they run on, through automation
  • Design new systems, tests, and procedures
  • Participate in our daytime on-call rotation, acting as a point of call for automated systems and highlighting availability issues when they can't be automatically resolved

What It Takes to Succeed:

  • An experienced software engineer with a strong interest in distributed systems and database technologies (like Cassandra or any other NoSQL databases)
  • Fluency in Python, Java, Golang, or a similar language; familiarity with more than one is a plus
  • Knowledge of best practices related to security, performance, high availability and disaster recovery
  • Proficiency in Kubernetes
  • Mastery of Linux
  • Expertise in configuration management (e.g., Puppet, Ansible, Chef)
  • Experience with public cloud platforms and related tooling (e.g., Terraform, AWS CloudFormation)

What You'll Get:

  • Full responsibility for projects from day one, an awesome team, and a dynamic work environment
  • Competitive salary with equity in the company, a pension scheme, and an optional employee stock purchase program
  • 25 days of paid holiday initially, rising to 29 with service, plus a one-day floating holiday every year
  • Private health insurance, including dental and vision
  • Flexible working hours and meeting-free Wednesdays
  • Regular 3-day Hackathons and weekly learning groups, always with interesting topics
  • £60 per month toward any exercise of your choice
  • Quarterly offsites

Yelp values diversity. We’re proud to be an equal opportunity employer and consider qualified applicants without regard to Age, Disability, Gender Reassignment, Marriage or Civil Partnership, Pregnancy and Maternity, Race, Religion or Belief, or Sex.
