Machine Learning Systems Engineer

Gensyn

Software Engineering
Remote
Posted on May 24, 2023

The world will be unrecognisable in 5 years.

Machine learning models are driving our cars, testing our eyesight, detecting our cancer, giving sight to the blind, giving speech to the mute, and dictating what we consume, enjoy, and think. These AI systems are already an integral part of our lives and will shape our future as a species.

Soon, we'll conjure unlimited content: from never-ending TV series (where we’re the main character) to personalised tutors that are infinitely patient and leave no student behind. We’ll augment our memories with foundation models—individually tailored to us through RLHF and connected directly to our thoughts via Brain-Machine Interfaces—blurring the lines between organic and machine intelligence and ushering in the next generation of human development.

This future demands immense, globally accessible, uncensorable computational power. Gensyn is the machine learning compute protocol that turns machine learning compute into an always-on commodity resource, outside of centralised control and as ubiquitous as electricity, accelerating AI progress and ensuring that this revolutionary technology is accessible to all of humanity through a free market.


Our Principles:


AUTONOMY

  • Don’t ask for permission - we have a constraint culture, not a permission culture.

  • Claim ownership of any work stream and set its goals/deadlines, rather than waiting to be assigned work or relying on job specs.

  • Push & pull context on your work rather than waiting for information from others and assuming people know what you’re doing.

  • No middle managers - we don’t (and will likely never) have middle managers.

FOCUS

  • Small team - misalignment and politics scale super-linearly with team size. Small protocol teams rival much larger traditional teams.

  • Thin protocol - build and design thinly.

  • Reject waste - guard the company’s time, rather than wasting it in meetings without clear purpose/focus, or bikeshedding.

REJECT MEDIOCRITY

  • Give direct feedback to everyone immediately rather than avoiding unpopularity, expecting things to improve naturally, or trading short-term pain for extreme long-term pain.

  • Embrace an extreme learning rate rather than assuming limits to your ability/knowledge.


Responsibilities

👉 Build execution frameworks - design and create low-level execution frameworks to train models on remote hardware

👉 Scale - build methods for handling datasets, models and training procedures across all shapes/sizes

👉 Design and own your code - take care of designing the frameworks and handle all maintenance related to keeping your codebase up to date

👉 Follow best practices - build in the open with a keen focus on designing, testing, and documenting your code

👉 Write & engage - contribute to technical reports/papers describing the system and discuss with the community


Minimum requirements

Framework development - experience interfacing with common ML frameworks (TensorFlow, PyTorch, ONNX, etc.) to build your own abstractions

Distributed systems experience - experience coordinating many compute resources for large computations

Solving problems that scale - experience working on massive datasets/large language models, and knowledge of pipeline and data parallelism techniques

Infrastructure - previously provisioned and/or configured large-scale infrastructure for ML training tasks

Background in relevant theory - i.e. computer science, machine learning, algorithms, distributed systems

Strong willingness to learn Rust - as a Rust-by-default company, we require that everyone learns Rust so that they have context and can work across the entire codebase

✅ Highly self-motivated with excellent verbal and written communication skills

✅ Comfortable working in an applied research environment - with extremely high autonomy

Nice to haves

🔥 Rust experience - systems-level programming experience in Rust

🔥 Compiler-level experience - previously worked at the lower levels of the ML stack - on compilers or hardware implementation

🔥 Quantisation / constrained optimisation - previously worked on overcoming bottlenecks in ML training from hardware or communication constraints

🔥 Decentralised model training frameworks - exposure to frameworks like Moshpit SGD that effectively optimise models despite high-latency, heterogeneous compute environments

🔥 Open source work - experience working with large open source codebases (ideally as maintainer)


Compensation / Benefits:

💰 Competitive salary + share of equity and token pool

🌐 Fully remote work - we hire between the West Coast (PT) and Central Europe (CET) time zones

🛫 4x all-expenses-paid company retreats around the world per year

💻 Whatever equipment you need

❤️ Paid sick leave

🏥 Private health, vision, and dental insurance - including spouse/dependents [🇺🇸 only]

Note: please only submit CVs in .pdf format.