Site Reliability Engineer // Okta

October 25, 2019

Position Description:

At Okta our motto is “Always On”, and nowhere do we embrace that more than in Technical Operations. We strive to build the most reliable and performant systems on the planet through the skillful use of automation. If you like to be challenged and have a passion for solving problems at scale with automation, testing and tuning, then we would love to hear from you. The ideal candidate exemplifies the ethos “If you have to do something more than once, automate it” and can rapidly self-educate on new concepts and tools.

You will work on:

  • Designing, building, running and monitoring Okta’s production infrastructure
  • Responding to production incidents and determining how we can prevent them in the future
  • Triaging and troubleshooting complex production issues to ensure reliability and performance
  • Identifying and automating manual processes
  • Continuously evolving our monitoring tools and platform
  • Promoting and applying best practices for building scalable and reliable services across engineering
  • Developing and maintaining technical documentation, runbooks, and procedures
  • Supporting a 24×7 online environment as part of an on-call rotation

You are an ideal candidate if you:

  • Have experience automating and running large-scale production Java/Tomcat services in AWS (EC2, ECS, KMS, Kinesis, RDS) or other cloud providers
  • Are able to code to a good standard with any programming language, but especially Ruby or Python, using source control and Agile methodologies
  • Have experience writing infrastructure as code using tools such as Chef and Terraform
  • Have a deep understanding of MySQL, including replication and clustering strategies
  • Have knowledge of NoSQL cluster data stores such as DynamoDB, Redis, Cassandra or Elasticsearch
  • Have experience using and supporting log and telemetry aggregation services such as Splunk and Wavefront
  • Have a solid understanding of CI/CD principles, Linux fundamentals, networking concepts and IP protocols
  • Have scripting skills for operational tooling in Bash, Ruby, Python, Go or similar
  • Have experience running container technology in production

Education and Training:

  • B.S. in Computer Science (a plus) or relevant experience

Okta is an equal opportunity employer.
