This position, on the Platform Operations team, is responsible for operating application, compute, and storage infrastructure. A successful candidate for this role will have a strong production mindset and mature judgment to complement their technical skills. This individual will enable teams of developers, SREs, and Platform Engineers to use, design, implement, and maintain Zoox's on-prem infrastructure, and will need to thrive in a fast-paced, frequently changing environment.

Responsibilities

  • Administer Linux systems and applications at an expert level (Ubuntu/Debian required)
  • Apply expert-level skills in Linux concepts, administration, and troubleshooting
  • Deploy new systems in a rapidly expanding environment
  • Document new and existing processes and architectures
  • Support hardware install and advanced troubleshooting activities
  • Participate in IP management, DNS administration, and hardening efforts
  • Participate in maturing our patching, upgrade, and maintenance activities
  • Participate in maintaining existing monitoring systems and creating new monitoring solutions
  • Experience with enterprise-level production environments required
  • Experience with PXE booting and network imaging/installs
  • Experience with GPU compute is a plus
  • Participate in a 24x7 on-call rotation

Qualifications

  • Minimum of 5 years of Linux system administration experience (Ubuntu experience strongly preferred)
  • Demonstrable expert-level Linux knowledge
  • Experience with system deployment tools (PXE, Cobbler, Digital Rebar, Foreman, etc.)
  • Commensurate knowledge of x64-based hardware and storage
  • Extensive networking knowledge, including layers 1-3, IPv4, VLANs, etc.
  • Experience with or exposure to Linux GPU compute systems desired (multi-GPU, CUDA)
  • Scripting abilities (Bash, Python, etc.)
  • Experience with or exposure to orchestration tools: Ansible/Salt (preferred), Chef, or Puppet

Bonus Qualifications

  • Docker container experience
  • Kubernetes experience
  • Experience with VMware solutions
  • Experience with Windows Server management
  • Experience with Mellanox and Arista networking equipment