Site Reliability Engineer
Information Technology - Infrastructure Design and Support
SpaceX was founded under the belief that a future where humanity is out exploring the stars is fundamentally more exciting than one where we are not. Today SpaceX is actively developing the technologies to make this possible, with the ultimate goal of enabling human life on Mars.
SITE RELIABILITY ENGINEER
SpaceX is looking for an experienced engineer with deep knowledge and broad experience across Linux-based technologies, particularly in the Kubernetes and container orchestration space. This employee will be a member of the Starlink Information Technology team focused on supporting the infrastructure that will drive thousands of satellites and provide global internet access.
- Work closely with other SpaceX engineers to build and improve software platforms and related technologies in a world-class environment
- Own responsibility for processes, systems, and tools you create and manage, with a strong focus on continuous improvement and reliable operations
- Advise other engineers on Kubernetes best practices regarding CI/CD workflows, architecture designs, and microservice orchestration principles
- Use quantitative reasoning to determine short- and long-term growth projections and develop strategic plans for the future needs of the Starlink infrastructure
- Identify and eliminate security vulnerabilities in the infrastructure and software stack, and work closely with Information and Product Security teams to evaluate potential problems and solutions
- Eliminate toil and improve automation of rote work
Degree in computer science, engineering, math, physics, information systems, or a similar technical discipline plus 3 years of professional experience as a systems administrator/engineer; OR at least 5 years of professional experience as a systems administrator/engineer without a degree
Experience managing a moderate to large Kubernetes infrastructure
Experience with Linux servers in virtualized environments
Experience with Python or Bash scripting and automation
PREFERRED SKILLS AND EXPERIENCE:
- Expertise in creating repeatable, reliable, scalable systems architectures, with high availability, fault tolerance, performance tuning, monitoring, and statistics/metrics collection
- Expert working knowledge of service mesh solutions (preferably Istio with Envoy) and SDN/network overlays
- Experience designing and running stateful sets, with associated persistent volumes
- Demonstrated experience with both declarative (preferably Puppet) and imperative (preferably Ansible) configuration management solutions
- Must be comfortable working with mission critical and sensitive systems, with a sense of urgency appropriate to the responsibilities
- Desire and ability to ramp up quickly on new tools and techniques
- Excellent verbal and written communication skills
- To conform to U.S. Government space technology export regulations, applicant must be a U.S. citizen, lawful permanent resident of the U.S., protected individual as defined by 8 U.S.C. 1324b(a)(3), or eligible to obtain the required authorizations from the U.S. Department of State.
SpaceX is an Equal Opportunity Employer; employment with SpaceX is governed on the basis of merit, competence and qualifications and will not be influenced in any manner by race, color, religion, gender, national origin/ethnicity, veteran status, disability status, age, sexual orientation, gender identity, marital status, mental or physical disability or any other legally protected status.
Applicants wishing to view a copy of SpaceX’s Affirmative Action Plan for veterans and individuals with disabilities, or applicants requiring reasonable accommodation to the application/interview process should notify the Human Resources Department at (310) 363-6000.