Site Reliability Engineer
Fairfax, VA, USA
Posted on Saturday, June 17, 2023
Spruce builds digital versions of identification, like mobile driver’s licenses, that offer more security, privacy protection, and control for individuals across their digital interactions. Credible, our product, is built on three technology pillars: it is privacy-enhancing, interoperable, and open source.
The site reliability engineer will serve as a bridge between the development and customer operations teams, enabling the development team to bring new software and features to production as quickly as possible while maintaining an agreed-upon level of IT operations performance and error risk in line with customer service level agreements. This individual will actively diagnose and manage deployment issues, and will develop automation and mitigation systems to improve future deployments. The role focuses on customers in the Pacific time zone, and working hours will reflect this focus.
There are no well-worn paths or theoreticals here. Every Spruce technical staffer gets their hands dirty writing code, learning new technologies, and solving problems at the bleeding edge of our space. We hire results-oriented developers who love technology and are committed to intellectual honesty, user privacy, and innovation.
- Experience with and functional knowledge of Kubernetes
- Experience with and functional knowledge of mobile app development and processes
- Demonstrated ability to balance reliability with agility and ability to react quickly
- Ability to operate in a high-ambiguity environment with shifting priorities while building bedrock processes
- Strong technical communication skills with ability to tailor to different audiences
- Ability to work core hours in the Pacific time zone (8am to 5pm) as well as availability for evening deployments
- Experience across multiple organizational sizes and types
- Create standards, best practices, and processes to streamline development and ensure consistency, repeatability, and reliable deliverables
- Contribute reliability, monitoring, and consistency updates to the Spruce codebase balancing security, privacy, and customer needs
- Review PRs for conformance to standards
- Develop automated gates for review and quality assurance that catch errors early and make decisions that ensure system resiliency
- Troubleshoot, identify, analyze, and either remedy or document production errors appropriately
- Establish SLAs along with processes and systems to measure, maintain, and budget for system availability and uptime
- Serve as primary on-call resource and scheduler
- Experience with and functional knowledge of Rust
- Experience with HSM management
- Experience with digital identity use cases
- Experience with GitOps best practices
We are passionate about cultivating a thriving culture of diverse individuals who bring unique perspectives to our mission. We are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or veteran status.
Spruce offers competitive cash and equity compensation along with an excellent benefits package including:
- Quality group health insurance coverage
- Unlimited PTO
- International team with remote-first policy
- Team gatherings around the world