Perplexity AI Inference Intern London | 2–3 Spots

Company: Perplexity
Role: UK Internship Program, AI Inference Team
Location: London, UK (hybrid: 3 days office / 2 days WFH)
Duration: 13 weeks (full-time or part-time)
Class size: Only 2–3 intern spots (2026 class)
Eligibility: Master's or PhD in CS/Engineering; 2025–2026 academic year
Visa: No visa sponsorship; student visa holders need university work approval

Overview

Perplexity is hiring 2–3 exceptional Master's or PhD interns for its AI Inference team in London. You'll optimize serving latency and throughput for models ranging from single-node embeddings to distributed sparse Mixture-of-Experts deployments, working across the stack from GPU kernels through networking and monitoring.

Key Requirements & Critical Rules

  • Degree: Pursuing a Master's or PhD in Computer Science or Engineering (2025–2026 academic year).
  • Focus: Performance-related subjects — HPC, compilers, distributed systems.
  • Technical depth: Strong systems fundamentals; multi-threading, networking, compilation, systems programming.
  • ML/GPU: PyTorch/JAX; CUDA, Triton; OpenMPI / HPC experience.
  • Work: Improve inference latency/throughput; new model support; quantization and stack-wide optimization.
  • Schedule: 13 weeks; hybrid 3 days office / 2 days WFH in London.
  • Spots: Only 2–3 interns in the 2026 class — highly selective.
  • Visa: No visa sponsorship; student visa holders need their university to approve work eligibility.
  • Not provided: No housing or health insurance for interns (FT employees get benefits).
  • Outcome: Outstanding performers may receive full-time offers (no fixed cap).
  • Apply: Official Perplexity website application required.

How to apply: Submit your application through the official Perplexity website.



