Systems Test & Analysis Engineer (Storage/Distributed Systems)
Your mission
We are seeking a highly curious and tenacious Investigator to join our team. While we have "builders" focused on creating tests, we lack a dedicated individual to examine failures, verify that the tests themselves are valid, and understand in depth why they fail. This role is crucial for establishing tight feedback loops between development and support, ultimately enabling us to ship without fear.
If you thrive on digging through data, embrace complex debugging, and enjoy finding the critical clue in 20,000 lines of logs, this is the role for you. We hope to find a candidate who has all the relevant skills for this position, but we are also pragmatic: a growth mindset and a willingness to learn and build up knowledge are ultimately more valuable.
Your profile
The Mindset We Value:
Investigator: Driven by intense curiosity ("Why did it fail?").
Empirical: Don't guess – find a way to get the data.
Observability-Minded: Committed to using data and metrics to understand the system's behavior.
Patient: Has the fortitude to tackle complex, deep-seated issues.
Pragmatic: Delivers results that move the needle.
Required Skills & Experience:
Coding & Scripting: Strong proficiency in Python, Bash, and potentially Go for automation (including experience with tools like teuthology).
Debugging & Tracing: Expertise with system internals (strace, lsof, /proc), tracing technologies (eBPF), and general debugging (gdb).
System & Resource Analysis: Ability to analyze system resources, including load, iowait, and zombie processes.
Storage & Networking: Deep understanding of, and debugging experience with, storage concepts (S3, HTTP errors, POSIX, ACLs, flock, etc.) and networking tools (tcpdump, Wireshark).
Domain Expertise: Familiarity with tools for breaking storage (fio, fsx, elbencho, the xfstests suite) and an understanding of distributed systems principles (experience with tools like Jepsen or Antithesis is a plus).
Code Reading: Ability to read and follow C/C++ code.
Why us?
- Autonomy and Ownership: We encourage you to work independently and take ownership of your tasks, while providing a supportive team environment for collaboration and guidance.
- Mentorship and Growth: We recognize that mastering complex topics requires time, focused effort, and expert guidance. Your colleagues will actively support you to ensure your success and continuous development.
- Owner-managed company: short decision paths and high adaptability
- Opportunity to grow in a dynamic, international technology environment
- Flexible working hours
- Collaborative environment with close interaction across teams
About us
As a Ceph Premium Partner, we offer innovative, vendor-independent cloud-native solutions for software-defined storage. We specialize in cyber-secure storage solutions for AI, hybrid clouds, and backup. In addition, we offer professional managed services for hybrid cloud solutions based on OpenStack and other open source technologies to ensure maximum flexibility and avoid vendor lock-in.
