The infrastructure behind AI is mostly invisible. We are making it legible.
Plain Theory Labs is a research and engineering lab building the first independent certification framework for sustainable AI compute infrastructure. We measure what institutions are actually operating, grade it against a defensible methodology, and produce findings that hold up to scrutiny.
We are AI and HPC infrastructure engineers with careers inside these systems — at research universities, national laboratories, and academic medical centers. We have configured the schedulers, tuned the GPU allocations, read the telemetry. We know what efficient looks like, and what inefficient looks like. We care about infrastructure that works well now and remains defensible as the scale and scrutiny of AI compute increases.
What actually happens when you use AI
Every query sent to a large language model, every image generated, every inference run on a research cluster triggers a cascade of physical events most people never consider. Somewhere, a GPU wakes up and draws between 300 and 700 watts. A cooling system — chilled water loops, computer room air handlers, or direct liquid cooling — removes the heat. Power delivery infrastructure converts and conditions electricity from the grid. A data center manages space, redundancy, and thermal load, consuming water at rates that vary enormously by design and climate.
A single A100 GPU running continuously for a year consumes roughly 3,500 kWh of electricity. A modest research cluster of 100 GPUs draws more power than a small neighborhood. The largest AI training runs now require data centers with power capacity measured in hundreds of megawatts. This infrastructure exists, it is operating today, and its footprint is growing faster than anyone is measuring it rigorously.
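The annual figure follows from simple arithmetic, sketched here under the assumption of a 400 W average draw (the A100 SXM TDP; real draw varies with workload and GPU model):

```python
# Back-of-envelope annual energy for one GPU running continuously.
# 400 W is an assumed average draw, not a measured value.
HOURS_PER_YEAR = 24 * 365  # 8,760 hours

def annual_kwh(avg_watts: float) -> float:
    """Convert a continuous average power draw in watts to kWh per year."""
    return avg_watts / 1000 * HOURS_PER_YEAR

print(round(annual_kwh(400)))        # 3504 -- roughly the 3,500 kWh cited
print(round(annual_kwh(400) * 100))  # a 100-GPU cluster: ~350,400 kWh/year
```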
The people who use AI daily are not wrong to do so. The engineers who build these systems are not acting irresponsibly. The problem is that no one has built the instrumentation to know, with precision and accountability, how efficiently any of this is running — or what the gap between current operations and what is achievable actually looks like.
The measurement problem
Monitoring tools exist in abundance. HPC centers run Prometheus, Grafana, DCGM, and Slurm accounting. They collect utilization rates, job runtimes, power draw, and queue statistics continuously. The data is there. What does not exist is a methodology that turns that data into a grade — a defensible, assumption-explicit score that tells an institution not just what is happening, but how it compares to what is achievable, and what a specific operational change would produce.
Without that methodology, the monitoring data has no policy value. An HPC director cannot write a procurement requirement around a utilization rate with no reference point. A sustainability office cannot report against a number with no standard. A funding agency cannot evaluate efficiency claims with no independent basis. The tools measure. Nothing grades.
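To make the monitoring-versus-grading distinction concrete, here is a hypothetical sketch of what a grading step adds on top of raw telemetry. The letter bands and the 70% reference point are invented for illustration; they are not a published standard or ACE's actual methodology:

```python
# Illustrative only: grading requires an explicit, documented reference
# point. The 0.70 "achievable" utilization below is an assumption made
# for this sketch; a real methodology would justify and disclose it.
ACHIEVABLE_UTILIZATION = 0.70

def efficiency_grade(avg_gpu_utilization: float) -> str:
    """Grade average GPU utilization against the disclosed reference point."""
    ratio = avg_gpu_utilization / ACHIEVABLE_UTILIZATION
    if ratio >= 0.9:
        return "A"
    if ratio >= 0.7:
        return "B"
    if ratio >= 0.5:
        return "C"
    return "D"

print(efficiency_grade(0.257))  # prints "D" under these invented bands
```

The point is not the specific bands: it is that a grade forces the reference point into the open, where a dashboard never has to state one.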
Building the certification framework
Plain Theory Labs is building that standard. Our approach is an engine-based certification framework — modular analytical tools that ingest real operational data, apply transparent methodologies with fully disclosed assumptions, and produce graded findings that institutions can use for policy decisions, procurement requirements, and public reporting.
The framework is designed to be independent, which means its methods must be open to scrutiny. Every coefficient used in an environmental impact estimate is documented and configurable. Every recommendation carries a confidence label that reflects the quality of the underlying data. Nothing is asserted that cannot be traced to a source.
ACE: our first engine
The Adaptive Compute Efficiency Engine (ACE) is the first tool in the certification framework. It ingests Slurm scheduler exports and GPU telemetry from an HPC cluster, scores every job against a documented efficiency methodology, and produces a single-file report with findings, confidence labels, and fully disclosed assumptions. We validated it on a published, peer-reviewed dataset from a production system.
MIT Supercloud HPCA22 dataset — 35,745 production jobs
25.7% average GPU utilization
16% median GPU utilization
41% of jobs ran under one minute
33% of jobs showed near-zero GPU activity
These are not projections. This is what the engine found.
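The aggregation behind figures like these can be sketched as follows. The job-record schema here (`runtime_s`, `avg_gpu_util`) is invented for illustration; ACE's actual inputs are Slurm accounting exports and GPU telemetry:

```python
# Sketch of the summary statistics reported above, computed over a toy
# sample. Field names are hypothetical, not ACE's real input format.
def summarize(jobs: list[dict]) -> dict:
    n = len(jobs)
    utils = sorted(j["avg_gpu_util"] for j in jobs)
    return {
        "mean_util": sum(utils) / n,
        "median_util": utils[n // 2],
        "pct_under_1min": sum(j["runtime_s"] < 60 for j in jobs) / n,
        "pct_near_zero": sum(j["avg_gpu_util"] < 0.01 for j in jobs) / n,
    }

sample = [
    {"runtime_s": 45, "avg_gpu_util": 0.0},    # sub-minute, idle GPU
    {"runtime_s": 3600, "avg_gpu_util": 0.16},
    {"runtime_s": 7200, "avg_gpu_util": 0.85},
]
print(summarize(sample))
```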
Advancing science requires sustainable infrastructure
AI is not only a consumer of compute — it is increasingly central to how science gets done. Drug discovery, climate modeling, genomics, materials science, and fusion research all depend on GPU clusters operating at scale. The scientific community has a direct interest in infrastructure that is efficient, accountable, and built for the long term.
Sustainability and scientific productivity are not in tension. A cluster running at 15% average GPU utilization — which is common — is not just wasting energy. It is wasting research capacity. Right-sizing workloads, improving scheduler efficiency, and understanding where compute is going are simultaneously environmental and scientific priorities. The institutions doing the most ambitious science should be operating the most efficiently instrumented infrastructure.
Collaboration is how this gets built. HPC directors, sustainability officers, research computing teams, and infrastructure engineers at universities and national labs have the operational knowledge and the data. We are building the methodology and the tooling. The findings belong to the institutions that produce them.
The regulatory context is arriving
The EU AI Act includes provisions requiring environmental impact assessment for high-impact AI systems. Federal AI infrastructure spending in the United States is under growing scrutiny from oversight bodies and appropriators. NSF and DOE programs are beginning to ask sustainability questions in infrastructure grant reviews. Several states have introduced or are drafting data center efficiency legislation.
Institutions that wait for regulation to define the standard will be graded against a methodology they had no role in building. Institutions that engage now will have operational data, a tested methodology, and a track record when external requirements arrive. The certification framework we are building is designed to be that track record.
Work with us
We are looking for research universities, national laboratories, and academic medical centers that want to understand what their GPU infrastructure is actually doing. Not a vendor pitch. Not a monitoring dashboard. A rigorous, independent assessment with traceable methodology.
Get in touch