Data Center Reliability Engineer

Company: Phaidra
Location: U.S. Based (Pacific Time Zone preferred), with flexibility for APAC time zone overlap
Type: 100% Remote
Salary: $101,320 - $163,900 (depending on location tier), plus equity.

About Phaidra

Phaidra is revolutionizing industrial automation with AI-powered control systems. We enable industrial facilities to learn and improve automatically using reinforcement learning algorithms that convert sensor data into high-value actions. We focus on industrial applications where facilities are well-sensorized and have measurable KPIs, which makes them well suited to AI. We empower domain experts to configure these AI control systems without coding. Our team has a proven track record of applying AI to challenging problems, including work with DeepMind and Google's data centers.

About the Role

As a Data Center Reliability Engineer on the Data Science team, you will act as the bridge between raw infrastructure telemetry and actionable operational intelligence. You possess a deep understanding of mechanical and electrical systems, allowing you to analyze system data like a doctor reads a patient's chart. This hands-on role requires a proactive, ownership-driven mindset and excellent communication skills to drive impact in the critical world of data center uptime. You will be instrumental in teaching our AI models to understand the physical world by identifying "failure signatures" and refining the logic engine of our monitoring tools.

Responsibilities

  • Multidisciplinary Diagnostic Analysis: Analyze sensor data from mechanical (chillers, pumps) and electrical (UPS, switchgear, power feeds) systems to identify "failure signatures" for our LLM-driven monitoring tool.
  • Refining the Logic Engine: Act as a primary user of Phaidra's platforms, identifying gaps and collaborating with Engineering to influence future features and data quality.
  • Operational Insight Generation: Translate raw telemetry into SME-level logic and directions for our LLM tool to guide data center operators in real-time.
  • SME Development: Cultivate deep domain expertise across all facets of data center infrastructure, mastering mechanical and electrical dependencies.
  • Customer Guidance: Support customers by using the platform to provide clear, data-backed direction on complex problems.
  • Model Validation: Oversee pilot projects that test how the AI-driven SME tool interprets real-world stressors, ensuring operational realism, accuracy, and actionability.
  • Adaptability: Proactively take on challenges and scopes of work that are not explicitly defined, in a fast-moving team environment.

Key Qualifications

  • Experience: 2-3 years of relevant professional experience.
  • Education: Bachelor’s degree in Mechanical Engineering, Electrical Engineering, Control Theory, or a related field.
  • Analytical Grit: Deep interest in using data to diagnose system failures; a "tinkerer" who prefers solving real-world problems.
  • Technical Proficiency: Strong Python skills and experience with data manipulation libraries (Pandas/NumPy).
  • Communication Mastery: Ability to clearly and persuasively explain complex diagnostic findings to technical peers and non-domain stakeholders.
  • Unbiased Problem-Solving: Proven ability to approach problems without preconceived notions and find solutions independently or collaboratively.
  • Alignment with Values: Demonstrated commitment to Transparency, Collaboration, and Ownership.

Preferred Skills & Experience

  • Experience with critical infrastructure components (HVAC, power distribution, industrial automation).
  • Experience with time-series data from industrial sensors (SCADA, BMS, Smart Meters).
  • Curiosity about, or experience with, how LLMs can be used for root-cause analysis and automated reporting.

Onboarding & Interview Process

  • First 30 Days: Familiarize yourself with the company and team roadmap, review system ontologies and sensor data, shadow senior team members.
  • First 60 Days: Build proficiency in internal data tools, identify failure signatures, propose tooling solutions.
  • First 90 Days: Provide direct customer guidance, contribute to LLM "instruction set" refinement, present post-incident analysis.
  • Interviews are conducted via Google Meet and require an active camera connection.

Benefits & Perks

  • Fast-paced, team-oriented remote environment.
  • Competitive compensation & meaningful equity.
  • Outsized responsibilities & professional development.
  • Foundational training (functional, customer immersion, development).
  • Medical, dental, and vision insurance (varies by region).
  • Unlimited paid time off (minimum 20 days required).
  • Paid parental leave (varies by region).
  • Flexible stipends for workspace, well-being, and professional development.
  • Company MacBook.

Phaidra is an Equal Opportunity Employer and participates in E-Verify.

Job Overview

Posted: 6/5/2026
Category: Fullstack Development
Source: JobsCollider
