This position on the Platform Operations team is responsible for operating application, compute, and storage infrastructure. A successful candidate will have a strong production mindset and mature judgment to complement their technical skills. This individual will enable teams of developers, SREs, and platform engineers to use, design, implement, and maintain Zoox's on-prem infrastructure, and must be able to thrive in a fast-paced, frequently changing environment.
Responsibilities
Administer Linux systems and applications at an expert level (Ubuntu/Debian required)
Apply expert-level skills in Linux concepts, administration, and troubleshooting
Deploy new systems in a rapidly expanding environment
Document new and existing processes and architectures
Support hardware install and advanced troubleshooting activities
Participate in IP management, DNS administration, and hardening efforts
Participate in maturing our patching, upgrading, and maintenance activities
Participate in maintaining existing monitoring systems and creating new monitoring solutions
Experience with enterprise-level production environments required
Experience with PXE booting and network imaging/installs
Experience with GPU compute is a plus
Participate in a 24x7 on-call rotation
Qualifications
Minimum of 5 years of Linux system administration experience (Ubuntu experience strongly preferred)
Demonstrable expert-level Linux knowledge
Experience with system deployment tools (PXE, Cobbler, Digital Rebar, Foreman, etc.)
Comparable depth of knowledge in x64-based hardware and storage
Extensive networking knowledge, including OSI layers 1-3, IPv4, VLANs, etc.
Experience with or exposure to Linux GPU compute systems desired (multi-GPU, CUDA)
Scripting abilities (Bash, Python, etc.)
Experience with or exposure to orchestration tools: Ansible or Salt (preferred), Chef, or Puppet
Bonus Qualifications
Docker container experience
Kubernetes experience
Experience with VMware solutions
Experience with Windows Server management
Experience with Mellanox and Arista networking equipment