Hello,

I'm Saad Aamir, a software engineer building toward AI safety.

I build full-stack systems and LLM tooling, including a Model Context Protocol server for my studio. I'm now moving toward empirical work on the interpretability and oversight of frontier models.


33.6°N 73.0°E · 2024

What I do

I build software and the AI tooling that operates it.

Three areas where I spend most of my time: engineering production systems, integrating language models into real workflows, and moving toward empirical research on how those models behave.

Full-Stack Engineering

Production systems across Python, TypeScript, and Node: analytics dashboards, internal portals, REST APIs, and cloud infrastructure on AWS. 3+ years shipping at MobileLIVE and independently.

AI & LLM Tooling

Practical integrations of language models into existing systems: MCP servers, RAG pipelines with LangChain and FAISS, structured outputs with Pydantic, and grounded workflows designed to minimize hallucination.

Research Direction

Moving toward empirical AI safety research: mechanistic interpretability, behavioral evaluations, scalable oversight. Currently self-studying transformer internals and applying to research-focused programs.


Rawalpindi, PK · @saadaamir

About

More about me.

I'm a software engineer based in Rawalpindi, Pakistan. I graduated from NUST in 2023 and spent the next two and a half years at MobileLIVE, a Canadian tech consultancy, building full-stack features across analytics dashboards, internal portals, and AI-integrated workflows.

In early 2026 I started Dark Matter Studio, an independent web design and development practice. Most of my own work these days is on LLM systems, the plumbing that makes model integrations reliable. I built Dark Matter Co-Pilot, a Python MCP server that lets Claude reach into the studio's pipeline as typed tools. I also built a sycophancy evaluation harness with a hand-validated judge and proper statistics behind it.

What pulls me most is empirical work on how models behave. Building evaluations carefully, measuring what models actually do instead of what they look like they're doing, and staying honest about what the data can support. That's the direction I'm steering toward, one project at a time.


Experience

3+ years of engineering, shipping, and learning.

JAN 2026 → PRESENT

Founder & Full Stack Developer

Dark Matter Studio · Independent

Running an independent web design and development studio. Shipping Next.js sites for clients across fitness, finance, and creative industries. I also build my own LLM tooling: Dark Matter Co-Pilot, a Python MCP server that exposes studio operations to Claude, and a sycophancy evaluation harness measuring how language models hold up under user pushback.

JUL 2023 → DEC 2025

Software Engineer (Full Stack)

MobileLIVE · Canadian Consultancy · Remote

Owned full-stack feature development on Geotab Fleet Analytics: React dashboards backed by Node.js and TypeScript REST services, reducing query latency by around 40%. Refactored the RAB and RDA Lighting portals from .NET 6 to .NET 8, introducing the repository pattern and dependency injection to a legacy codebase. Integrated LLM-backed features into the PRF Portal, cutting manual workflow steps by half. Deployed and maintained AWS infrastructure with CI/CD via GitHub Actions.

JUL 2020 → SEP 2020

Web Development Intern

ZSystems · Lahore

Built responsive frontends and Node.js/Express services in agile sprints with senior engineers. Shipped internal tooling for logistics clients and gained hands-on exposure to production deployment pipelines.

SEP 2019 → JUN 2023

BS Computer Science

NUST · Islamabad

Final year project: NLP-based automated code review system using transformer models. Research assistant in the AI lab, working on sequence modeling for structured prediction tasks.

Selected work

Projects I've shipped and am currently building.

Dark Matter Co-Pilot

shipped

2026

MCP server that gives Claude access to real studio data

An MCP server that connects Claude Desktop to my studio's operations data (past client work, live leads, voice docs, and outreach templates), so I can draft outreach, update leads, and query case studies through natural conversation.

Python · FastMCP · SQLite · Pydantic v2


Sycophancy Evals

shipped

2026

Empirical eval of LLM dispositional behavior under user pushback

A controlled experiment measuring how often two same-scale open-weight language models (Llama 3.1 8B, Qwen 2.5 7B) reverse correct answers under user pushback. Hand-validated LLM-as-judge, question-level bootstrap CIs, 1,800 conversations per model.

Python · Inspect · Anthropic API · Ollama · Pydantic
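As an illustration of what "question-level bootstrap CIs" can mean in practice, here is a minimal sketch, not the harness's actual code. It assumes each question has several conversations, each scored 1 if the model reversed a correct answer and 0 otherwise, and it uses a plain percentile bootstrap. The function name, parameters, and toy data are all hypothetical. Resampling whole questions rather than individual conversations keeps conversations about the same question together, so the interval reflects question-to-question variation.

```python
import random
from typing import Sequence, Tuple


def question_level_bootstrap_ci(
    flips_by_question: Sequence[Sequence[int]],
    n_resamples: int = 2000,
    alpha: float = 0.05,
    seed: int = 0,
) -> Tuple[float, float, float]:
    """Return (point estimate, lower bound, upper bound) for the flip rate.

    flips_by_question[i] holds 0/1 outcomes for every conversation on
    question i (1 = the model reversed a correct answer). Hypothetical layout.
    """
    rng = random.Random(seed)
    n_questions = len(flips_by_question)

    def flip_rate(questions: Sequence[Sequence[int]]) -> float:
        total = sum(len(q) for q in questions)
        return sum(sum(q) for q in questions) / total

    point = flip_rate(flips_by_question)

    # Resample questions (not conversations) with replacement.
    estimates = []
    for _ in range(n_resamples):
        sample = [
            flips_by_question[rng.randrange(n_questions)]
            for _ in range(n_questions)
        ]
        estimates.append(flip_rate(sample))
    estimates.sort()

    # Percentile interval, with indices clamped to the valid range.
    lo_idx = max(0, int((alpha / 2) * n_resamples))
    hi_idx = min(n_resamples - 1, int((1 - alpha / 2) * n_resamples) - 1)
    return point, estimates[lo_idx], estimates[hi_idx]


# Toy data, invented for illustration: 6 questions, 3 conversations each.
toy = [[1, 1, 0], [0, 0, 0], [1, 0, 0], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
point, lower, upper = question_level_bootstrap_ci(toy)
print(f"flip rate {point:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

With so few questions the interval is wide, which is the point: the uncertainty is driven by the number of distinct questions, not the raw conversation count.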


AI Resume Matcher

shipped

2023

LLM-powered resume-to-JD alignment tool

Parses job descriptions and resumes using structured LLM extraction, then scores alignment across skills, experience, and tone. Reduced manual screening time significantly for early hiring pipelines.

Python · FastAPI · Claude API · React · Tailwind


Dark Matter Studio

shipped

2023

Landing page for my web studio

The public site for Dark Matter Studio, my small web studio based in Pakistan. Headline: "Websites That Turn Visitors Into Clients." Built for speed and conversion. No CMS, just clean Next.js and Tailwind.

Next.js · TypeScript · Tailwind · Vercel


Toolkit

What I work with day-to-day.

languages

Python · TypeScript · JavaScript · C++ · C# · SQL

ai / ml

MCP · LangChain · OpenAI API · FastAPI · FAISS · Pydantic · TensorFlow

backend

Node.js · Express · PostgreSQL · MySQL · MongoDB · SQLite · Docker · REST · JWT

frontend

React · Next.js · Redux · Tailwind · Framer Motion

cloud

AWS · EC2 · Lambda · S3 · CloudWatch · CloudFormation · GitHub Actions · Vercel

Writing

Things I've been thinking about.


Llama folds when you sound vague. Qwen folds when you sound specific.

What I learned about LLM evals by getting fooled three times in a row.


Jun 18, 2026 · 9 min read · AI Safety

Tools should return data; language models should return language

How I stopped trying to make my MCP tool write emails and let Claude do its job.


Jun 11, 2026 · 5 min read · MCP

Why 23 strangers in a room are more interesting than they look

Walk into a room with 22 other people. There's a better than 50% chance two of you share a birthday. That sounds wrong.


May 27, 2026 · 5 min read · Probability
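The "better than 50%" figure for 23 people can be checked in a few lines. This is a minimal sketch under the usual textbook assumptions (365 equally likely birthdays, no leap years, independent people); the function name is just illustrative. It computes the chance that everyone's birthday is distinct and subtracts that from 1.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday.

    Assumes birthdays are independent and uniform over `days` days.
    """
    if people > days:
        return 1.0  # pigeonhole: a shared birthday is guaranteed
    p_all_distinct = 1.0
    for already_in_room in range(people):
        # The next person must avoid every birthday already taken.
        p_all_distinct *= (days - already_in_room) / days
    return 1.0 - p_all_distinct


for n in (22, 23):
    print(n, round(shared_birthday_probability(n), 3))
# 22 0.476
# 23 0.507
```

The jump past one half happens exactly between 22 and 23 people, because 23 people form 253 distinct pairs, each of which is a chance for a match.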

Why nice guys finish first (sometimes)

The Prisoner's Dilemma, Axelrod's tournament, and how cooperation actually wins.


May 23, 2026 · 5 min read · Game Theory
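The "sometimes" in that title can be seen in a tiny simulation. The sketch below is not the post's demo; it is a minimal iterated Prisoner's Dilemma assuming the payoff values commonly used for Axelrod's tournament (3 each for mutual cooperation, 1 each for mutual defection, 5 and 0 when one side defects on a cooperator) and 200 rounds per match. The strategy and function names are illustrative.

```python
from typing import Callable, List, Tuple

# Payoffs as (player A, player B) for each pair of moves; C = cooperate, D = defect.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}

Strategy = Callable[[List[str], List[str]], str]


def tit_for_tat(own_history: List[str], opponent_history: List[str]) -> str:
    """Cooperate first, then copy the opponent's previous move."""
    return opponent_history[-1] if opponent_history else "C"


def always_defect(own_history: List[str], opponent_history: List[str]) -> str:
    """Defect on every round."""
    return "D"


def play(a: Strategy, b: Strategy, rounds: int = 200) -> Tuple[int, int]:
    """Play an iterated match and return the total scores (a, b)."""
    history_a: List[str] = []
    history_b: List[str] = []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = a(history_a, history_b)
        move_b = b(history_b, history_a)
        gain_a, gain_b = PAYOFFS[(move_a, move_b)]
        score_a += gain_a
        score_b += gain_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))      # (600, 600)
print(play(tit_for_tat, always_defect))    # (199, 204)
print(play(always_defect, always_defect))  # (200, 200)
```

Tit for tat never beats its opponent head to head, and it narrowly loses to the defector (199 to 204). But summed across its matches it collects 799 points against the defector's 404, which is the sense in which cooperation can win a tournament without winning any single game.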

Contact

Let's work together.

Open to research collaborations, consulting engagements, and senior engineering roles, especially anything adjacent to AI safety or applied ML infrastructure. Dark Matter Studio is also available for select client projects.

saadaamir473@gmail.com
