Is the Senior Manager System Engineering position remote, hybrid, or on-site?

The Senior Manager System Engineering role at GoDaddy is a remote position. The location specified by the employer is Remote Worldwide.

What is the salary range for the Senior Manager System Engineering role at GoDaddy?

The salary range for the Senior Manager System Engineering role at GoDaddy is not explicitly stated. Compensation is described as competitive and based on experience.

How do I apply for the Senior Manager System Engineering position?

You can apply directly through the application link on FutureTalent: https://www.futuretalent.online/jobs/3649-senior-manager-system-engineering-godaddy

JOB 8

About

At GoDaddy, the future of work looks different for each team. Some teams work in the office full-time; others have a hybrid arrangement (they work remotely some days and in the office some days), and some work entirely remotely.

This is a remote position, so you’ll be working remotely from your home. You may occasionally visit a GoDaddy office to meet with your team for events or meetings.

Join GoDaddy's Forge Ops team at the intersection of Data, Infrastructure, and AI-driven operations. As Senior Manager, Systems Engineering, you will lead the reliability, cost efficiency, and agentic operation of the Data & AI ecosystem that serves GoDaddy. This is a deeply technical leadership role, not a hands-off manager position. You will operate as GoDaddy’s L1/L2 authority over critical analytics and data platforms while advancing Forge Operations: a structured operating model designed to transition platform operations from hero-based, expert-dependent support to system-based, agent-assisted, self-improving operations. If you can translate a business problem into a technical architecture and that architecture into team execution, and you want to build the AI Ops pattern for a large-scale data organization, this role is for you.

Responsibilities

Own and operate GoDaddy’s analytical and data intelligence platforms (Redshift, QuickSight, FeedDB, Protegrity, Alation) as the authoritative L1/L2 platform owner — driving reliability, deployment standards, cost optimization, and user enablement across an ecosystem with a 50PB+ data lake and thousands of consumers.
Lead 24/7 incident management and production operations across 10+ Data & AI platforms, owning MTTR/MTTD targets, AAR rigor, and a root-cause-to-control loop that converts every incident into a runbook, monitoring improvement, or automation — not just a resolved ticket.
Architect and advance Forge Ops OS, the team’s agent-based operating model. This model uses history-informed early warning, auto-recovery agents, runbook intelligence, and bounded agentic orchestration. The aim is to move the team from operating systems directly to leading the agents that operate those systems.
Drive data platform cost efficiency through unit economics (cost per query, cost per workload, cost per dashboard visit), translating AWS spend into measurable business metrics and continuous optimization across Redshift, QuickSight, DPaaS, and ML infrastructure.
Manage operational planning and executive reporting on a weekly, monthly, and quarterly cadence. Run a sprint-based improvement program with roughly 70% of capacity allocated to strategic work. Provide clear traceability from team execution to company goals and landmark outcomes.

Requirements

5+ years of proven 24/7 production operations leadership: running incident response end-to-end, owning MTTR performance, leading post-mortems (AARs) that produce controls, and driving the systemic fixes that reduce incident recurrence.
Hands-on AWS architecture/platform expertise — Redshift, EMR/Airflow, Lambda, EKS, S3, IAM/RBAC, and CDK/CloudFormation — with end-to-end operational and cost ownership of at least two production data or analytics platforms.
Systems and software architecture fluency: able to translate business requirements into scalable technical designs, reason about architectural trade-offs, and decompose solutions into actionable engineering tasks without deferring all technical judgment to individual contributors.
Data platform operations at scale: ETL/ELT pipelines, data lakes, orchestration frameworks (Airflow, EMR), and BI tooling, with deep understanding of data quality, SLAs, lineage, and the dependency chains that connect producers to executive-facing consumers.
Technical team leadership with operational rigor: proven ability to lead engineers through sprint-based planning, capacity management, and cross-functional delivery, while maintaining the hands-on technical credibility to unblock, review, and elevate the team’s output.
Experience with AI/agentic operations — building or operating LLM-based tools such as automated runbooks, incident response agents, AAR generation systems, or bounded auto-recovery workflows.
Familiarity with graph databases or lineage/observability architectures (e.g., Neptune or equivalent) for dependency mapping, early warning, and blast-radius analysis in large data ecosystems.
Hands-on experience with Databricks or analytical compute platforms (Lakehouse, feature stores, ML infrastructure) in a production operations context.
Experience with data protection platforms (e.g., Protegrity) and PII/tokenization workflows in large-scale data lake or analytics environments.
Familiarity with ServiceNow/CMDB or equivalent incident management systems (Jira, PagerDuty) as operational systems of record — including MTTR/MTTD tracking and CI/lineage integration.
