Big Data isn’t a problem. It’s an opportunity.
At Alation, we help people find, understand, and trust data. So they not only excel in their work — they drive value for their enterprise, team, and role. In the words of one customer, “Alation makes me look like a rockstar.”
We help companies you know and trust empower their people with the best data every day. Alation helps Discover Financial Services quickly generate value from their data to create the product and customer service innovations that help the iconic credit card company remain number one in customer satisfaction. And real estate giant Keller Williams uses Alation to govern the more than 70 TB of data that empowers their global team of over 190,000 agents.
With $340M in funding, a valuation of over $1.7 billion, and 550+ customers, including 35% of Fortune 100 companies, Alation is poised to capitalize on data as an opportunity. Headquartered in Silicon Valley, Alation was named to Inc. Magazine’s Best Workplaces list for the fourth time. Do you want to join a team that welcomes new ideas, supports your growth, and recognizes your unique value?
Join us!
Job Description:
The SRE Team manages world-class Alation cloud infrastructure and ensures our state-of-the-art services' reliability, availability, and performance.
We seek a Senior Site Reliability Engineer (SRE) to manage and enhance the configuration, stability, performance, and network connectivity of our Commercial and FedRAMP cloud offerings.
What you’ll do:
Manage and run backend systems like Kubernetes, RDBMS, AWS services, and everything in between
Work closely with internal partners and teams to ensure that we ship software that meets security, SLA, and performance requirements
Maintain the Alation platform by diagnosing, predicting and correcting scaling problems
Participate in the on-call rotation, identify issues, and drive them to resolution while conducting blameless root cause analysis (RCA)
Write deployment plans, execute upgrades, develop documentation and capacity plans, and troubleshoot production issues
Coach and mentor junior-level engineers and contractors
Ensure Operational and ITIL best practices are documented and followed
Must have:
Must be a US citizen located on US soil to be considered for this position
BS/MS in Computer Science or equivalent and 5+ years of experience in Technical Operations production SaaS roles
Strong working knowledge of Kubernetes & Docker
Experience with a high-level language such as Python, Ruby, Go, or Java
Experience with a SaaS Product at Scale
Strong understanding of DevOps/Agile Principles
Good note-keeping (Confluence) and ticket management skills (Confluence, JIRA)
Strong experience with Linux/Unix Systems and cloud providers like AWS/Azure/GCP
Experience working with Infrastructure as Code (IaC) at scale using tools like Terraform, Chef, or Ansible
Good understanding of networking and messaging between services
Curiosity, growth mindset, and always willing to learn new technologies
Good communication and strong interpersonal skills
Experience with monitoring and alerting tools (e.g. Prometheus/Grafana, Datadog)
Experience with logging tools (e.g. Datadog, ELK, Loki)
Proven experience leading complex projects
Nice to have:
Experience with SCM tools (e.g. Git, GitHub)
Experience with SQL and NoSQL databases
Ability to work across multiple teams and good knowledge of AWS cloud services (e.g. EKS, RDS, IAM, Lambda)
Prior FedRAMP experience
Security and Privacy Responsibilities:
This position carries special Security and Privacy Responsibilities for protecting the U.S. Federal Government’s interests:
Know, acknowledge, and follow system-specific security policies and procedures
Protect data and individual privacy per requirements and regulations
Perform ongoing activities in compliance with service and contractual obligations
Participate in role-based training, completing assignments on a timely basis
Report security issues promptly and aid investigation when needed
Support controlled changes and vulnerability remediation activities
Work collaboratively with Information Security in designing, implementing, assessing or enhancing system-specific security and privacy controls
#LI-Remote
Compensation Pay Range:
$191,481.00 - $220,000.00
Salary Information
The base salary range is specific to the United States. The salary of the final candidate selected for this role will be set based on a variety of factors, including but not limited to internal equity, experience, education, work location, specialty and training. If the final candidate has a different level of experience, the base salary target range may be lower or higher than what is published.
Alation, Inc. is an Equal Employment Opportunity employer. All qualified applicants will receive consideration for employment without regard to that individual’s race, color, religion or creed, national origin or ancestry, sex (including pregnancy), sexual orientation, gender identity, age, physical or mental disability, veteran status, genetic information, ethnicity, citizenship, or any other characteristic protected by law.
The Company will strive to provide reasonable accommodations to permit qualified applicants who have a need for an accommodation to participate in the hiring process (e.g., accommodations for a job interview) if so requested.
This company participates in E-Verify.