About Flashmind Labs
We are building foundation models that engage with reality directly and evolve beyond their training data. Models that are calibrated, unconstrained in their pursuit of truth, and built for decisions with real consequences.
We are a focused team of researchers and engineers in Dubai. Come build the future of learning systems with us.
About this Role
The quality of our models depends directly on the quality of our infrastructure. We're looking for engineers to own the systems that source, process, and serve data at scale, and to build robust, efficient training pipelines. You will work at the intersection of data engineering and ML systems, ensuring our researchers can iterate fast on architectures and training objectives.
You will work with a team of scientists and engineers. Your responsibilities will include:
- Designing and scaling data ingestion, cleaning, and processing pipelines
- Building and optimizing distributed training infrastructure
- Developing tooling for dataset inspection, curation, and quality monitoring
- Implementing efficient data loading and preprocessing for large-scale training workloads
Minimum Qualifications:
- Bachelor's degree or equivalent experience in Computer Science or a related field
- Proficiency in Python and Rust
- Experience with cloud infrastructure (AWS, GCP, or Azure)
- Understanding of machine learning fundamentals and accelerator-based (GPU-like) compute environments
Preferred Qualifications:
- Strong track record building and maintaining ML training infrastructure
- Experience developing, testing, and maintaining large-scale distributed systems
- Proficiency in data processing frameworks (Spark, Beam, Dask, or similar) and formats (Parquet, Arrow)
- Experience with Kubernetes and containerized workflows