Senior Systems Administrator
Job Description:
Our client is a dynamic, growing SaaS start-up building a research data management platform that specializes in large-scale studies operating around the world.
They live to innovate and to empower scientists to focus on the science, not the technology, shortening the time to science and to a cure.
About the Role:
This is a true Senior Systems Administrator role, focused on system reliability, monitoring, security, cost optimization, and operational excellence, not on pipelines or developer tooling.
You will manage multiple environments across multiple customers, ensuring stability, scalability, and compliance. You will also collaborate with external institutions and internal engineering teams to support new experimental environments and services for future projects and products.
The right candidate combines deep technical expertise with a strong process mindset and a proven track record of driving cost, process, and service optimizations in cloud environments.
Key Responsibilities
Environment & Customer Management
- Manage and enforce discipline across multiple environments (development, staging, production).
- Oversee infrastructure for multiple customers with unique configurations and compliance requirements.
- Maintain consistency and prevent configuration drift across environments.
- Plan and execute patches, upgrades, and rollouts with minimal disruption.
- Enforce strict rules of engagement for access, updates, and incident response.
Cloud & Systems Administration
- Administer AWS infrastructure, including: EKS, EC2, RDS Postgres, S3, Route 53, IAM, Secrets Manager, GuardDuty, Config, Inspector, Security Hub, WAF, etc.
- Support expansion into GCP, including compute, IAM, and networking services.
- Perform Linux systems administration: patching, hardening, troubleshooting, performance tuning.
- Manage Kubernetes clusters (EKS today, GKE in future), including upgrades and scaling.
- Administer and tune Postgres (Aurora RDS) databases.
Monitoring, Cost & Optimization
- Operate Datadog across metrics, logging, and APM.
- Build dashboards, alerts, and reports for proactive observability.
- Integrate monitoring alerts with AWS/GCP services.
Collaboration & Experimentation
- Work with partner institutions’ sysadmins to support joint projects and integrations.
- Support pilot projects, proof-of-concept systems, and future product initiatives.
- Serve as a technical bridge between infrastructure operations and R&D.
Process, Documentation & Audit Readiness
- Write and maintain Methods of Procedure (MOPs) and Standard Operating Procedures (SOPs).
- Maintain detailed documentation in Confluence to ensure audit readiness (ISO and other certifications).
- Use Jira to manage SysAdmin backlogs, patching cycles, and infrastructure updates.
- Champion a documentation-first, process-driven approach to system administration.
Security & Compliance
- Administer CrowdStrike for endpoint and workload protection.
- Manage vulnerabilities end-to-end: identification, patching, validation, documentation.
- Apply best practices for IAM, secrets management, encryption, and network segmentation.
- Support ISO, FedRAMP, and SOC2 compliance efforts through disciplined system operations.
Leadership & Forward Planning
- Evaluate and operationalize new services to enhance scalability, reliability, and security.
- Balance stability with innovation, ensuring infrastructure is ready for future growth.
Required Skills & Experience
- 7+ years in System Administration roles with strong cloud operations experience
- Strong Kubernetes administration (EKS required, GKE nice to have)
- Strong Postgres (Aurora RDS) administration experience
- Deep Linux administration expertise (patching, troubleshooting, hardening)
- Proven track record managing multi-environment, multi-customer infrastructures
- Experience with configuration management tools (e.g., Ansible, Puppet, or equivalent)
- Strong process orientation with MOP/SOP writing experience
- Comfortable using Jira, Confluence, and Bitbucket to plan, track, and document
- Detail-oriented, disciplined, and meticulous about documentation
- Ability to drive a culture of operational discipline, cost-awareness, and continuous improvement
- Experience with Keycloak
Nice-to-Haves
- Certifications such as AWS SysOps Administrator or Certified Kubernetes Administrator
- Familiarity with Active Directory and other Microsoft services
- Knowledge of secure compute and data processing tooling, including SLURM and Open OnDemand
- Experience in regulated environments (ISO, FedRAMP, SOC2, HIPAA, NIST 800-171)
- GCP compute, IAM, and networking experience
- Infrastructure-as-Code (Terraform, CloudFormation)
- Experience with Globus, including Globus Auth, Globus Share, and Globus Compute
- Datadog knowledge (dashboards, monitors, logs, APM, synthetic testing, reporting)
- Experience collaborating with external institutions/partners and internal engineering teams
- Familiarity with CrowdStrike or similar endpoint security platforms
What We Offer
- Competitive salary and benefits package
- Hybrid and flexible working conditions
- Opportunities for growth and leadership
- Supportive and structured team environment
- Access to tools and resources for ongoing professional development