Building a Resilient Hardware Fingerprint
Binding software to a machine usually means trusting one identifier that is brittle or spoofable. A pairwise-hash fingerprint survives a component swap without losing the machine's identity.
A lot of software wants to know which machine it is running on. Licensing, device binding, fraud checks, "this install is authorized for this computer." The usual approach is to grab one identifier (a MAC address, a disk serial, a motherboard UUID) and treat it as the machine's identity. That approach is both too fragile and too weak. Too fragile because a single hardware change breaks it. Too weak because a single identifier is the easiest thing in the world to spoof.
I wanted something better: a fingerprint that stays stable when a machine changes a little, and changes when the machine is actually a different machine. That is hwprofiler, a small cross-platform tool that reads a handful of hardware identifiers and folds them into one serial.
What it reads
Four components, each reduced to a normalized string ID:
| Component | What it captures |
|---|---|
| CPU | vendor, family/model/stepping, brand string |
| Disk | block size and sector count |
| Motherboard | manufacturer, product, firmware vendor and version |
| MAC | the network adapter's hardware address |
Individually, none of these is a good identity. A MAC can be changed in software. A disk gets replaced. A motherboard reports the same product string across thousands of identical units. The trick is not finding the one perfect identifier, because there is not one. The trick is in how you combine them.
The pairwise scheme
The naive combination is to concatenate all four IDs and hash them once. That gives you a single fragile value: change any one component and the whole fingerprint changes, so you are right back to the brittleness of a single identifier.
Instead, hwprofiler hashes every pair of components. Four components give six pairs:
    h0 = H(CPU ‖ Disk)
    h1 = H(CPU ‖ Mobo)
    h2 = H(CPU ‖ MAC)
    h3 = H(Disk ‖ Mobo)
    h4 = H(Disk ‖ MAC)
    h5 = H(Mobo ‖ MAC)
Each H is the first 16 bytes of SHA-256(id_a ‖ id_b), written as 32 lowercase hex characters. The final serial is built the same way from the pair hashes: the first 16 bytes of SHA-256(h0 ‖ h1 ‖ ... ‖ h5), also as 32 hex characters.
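As a rough illustration, the scheme can be sketched in a few lines of Python. This is not hwprofiler's own code. The names `short_hash`, `build_profile`, and `COMPONENT_ORDER` are made up for the example, and two details are assumptions: that ‖ means plain string concatenation of UTF-8 text, and that the serial is computed over the concatenated hex strings rather than the raw hash bytes.

```python
import hashlib
import json
from itertools import combinations

# Fixed component order so the pair indices h0..h5 are stable.
COMPONENT_ORDER = ("cpu", "disk", "mobo", "mac")


def short_hash(data: str) -> str:
    """First 16 bytes of SHA-256, as 32 lowercase hex characters."""
    return hashlib.sha256(data.encode("utf-8")).digest()[:16].hex()


def build_profile(ids: dict) -> dict:
    """Return the six pair hashes and the serial for four component IDs."""
    # combinations() yields (cpu, disk), (cpu, mobo), (cpu, mac),
    # (disk, mobo), (disk, mac), (mobo, mac), which is the h0..h5 order.
    pair_hashes = [
        short_hash(ids[a] + ids[b])  # assumption: plain concatenation
        for a, b in combinations(COMPONENT_ORDER, 2)
    ]
    # Assumption: the serial hashes the concatenated hex strings.
    serial = short_hash("".join(pair_hashes))
    return {"serial": serial, "h": pair_hashes}


# Hypothetical normalized IDs, for illustration only.
example_ids = {
    "cpu": "cpu-example",
    "disk": "disk-example",
    "mobo": "mobo-example",
    "mac": "mac-example",
}
print(json.dumps(build_profile(example_ids), indent=2))
```

Fixing the component order matters more than the particular order chosen: a verifier can only compare pair by pair if h0 always means the same pair.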
The output is a JSON object with the six pair hashes and the serial, or just the 32-character serial on its own:
{ "serial": "<32-hex>", "h": ["<32-hex>", "<32-hex>", "..."] }

Why pairs
The point of the six pair hashes is what they let a verifier do. Swap the disk,
and only the three pairs that involve the disk (h0, h3, h4) change. The
other three are untouched. A verifier holding a known-good profile can compare
pair by pair and decide how much change to tolerate. Each component appears in
exactly three pairs, so replacing one component leaves three of six pairs
matching, and replacing two leaves only one. Require three of six matches, and
a single component replacement still identifies the same machine, while a
wholesale swap of everything fails.
That turns a binary "same or not" into a graded measure of how much the machine has changed. You get stability against ordinary upgrades and repairs without giving up the ability to notice when you are looking at a genuinely different box. The single-hash approach cannot express that at all.
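A verifier along these lines might look like the sketch below. It is an illustration, not the tool's validation server. The helper names and the default threshold are assumptions, and the pair hashes are rebuilt locally so the block runs on its own. Replacing one component touches three of the six pairs, so three matches remain; replacing two leaves a single match.

```python
import hashlib
from itertools import combinations


def pair_hashes(ids: dict) -> list:
    """Six truncated SHA-256 pair hashes in h0..h5 order."""
    order = ("cpu", "disk", "mobo", "mac")
    return [
        hashlib.sha256((ids[a] + ids[b]).encode("utf-8")).digest()[:16].hex()
        for a, b in combinations(order, 2)
    ]


def count_matches(known: list, observed: list) -> int:
    """Number of positions where both profiles carry the same pair hash."""
    if len(known) != len(observed):
        raise ValueError("profiles must have the same number of pair hashes")
    return sum(k == o for k, o in zip(known, observed))


def same_machine(known: list, observed: list, min_matches: int = 3) -> bool:
    """Accept the observed profile if enough pair hashes still match."""
    return count_matches(known, observed) >= min_matches


# Hypothetical component IDs, for illustration only.
enrolled = {"cpu": "cpu-A", "disk": "disk-A", "mobo": "mobo-A", "mac": "mac-A"}
new_disk = {**enrolled, "disk": "disk-B"}
new_disk_and_mac = {**new_disk, "mac": "mac-B"}

known = pair_hashes(enrolled)
print(count_matches(known, pair_hashes(new_disk)))          # 3
print(count_matches(known, pair_hashes(new_disk_and_mac)))  # 1
print(same_machine(known, pair_hashes(new_disk)))           # True
print(same_machine(known, pair_hashes(new_disk_and_mac)))   # False
```

The threshold is a policy choice. Three tolerates exactly one replaced component, and anything higher rejects even a single swap.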
One codebase, many platforms
The same logic runs on Linux, Windows, the BSDs, and (compiling, at least) macOS, built with Meson from a single source tree. Each platform has its own way of reading CPU, disk, motherboard, and MAC details, but they all normalize down to the same ID format, so the serial for a machine is the serial for that machine regardless of which OS computed it. The tool can print the profile, print just the serial, or POST the JSON to a validation server.
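To see why normalization is what makes the serial portable, take the MAC address: Windows tools typically print it uppercase with hyphens, while Linux exposes it lowercase with colons. A minimal sketch of one possible normalization follows. The function name and the chosen canonical form (12 lowercase hex digits, no separators) are assumptions for illustration, not hwprofiler's actual ID format.

```python
import re


def normalize_mac(raw: str) -> str:
    """Reduce a MAC address to 12 lowercase hex digits with no separators."""
    # Drop separators such as ':', '-', '.', and whitespace.
    hex_digits = re.sub(r"[^0-9A-Fa-f]", "", raw).lower()
    if len(hex_digits) != 12:
        raise ValueError(f"not a 48-bit MAC address: {raw!r}")
    return hex_digits


# The same adapter as two platforms might report it (made-up address).
print(normalize_mac("AA-BB-CC-00-11-22"))  # aabbcc001122
print(normalize_mac("aa:bb:cc:00:11:22"))  # aabbcc001122
```

Whatever canonical form is chosen, every platform backend has to emit it byte for byte, because the pair hashes are computed over the exact string.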
What it is, and is not
It is worth being honest about what a hardware fingerprint actually buys you. It is identification, not authentication. It raises the cost of pretending to be a different machine; it does not make it impossible. A determined attacker in a VM can present whatever CPU, MAC, and board strings they like, and a fingerprint will faithfully hash the lies. Privacy matters too: a stable machine serial is a tracking identifier, and it should be treated like one.
So this is a bar-raiser, not a lock. It defeats casual license sharing and makes cloned installs visible, and it does so without betting the machine's identity on one value that a new SSD would erase. For a lot of real-world device-binding problems, "stable, graded, and hard to fake casually" is exactly the right amount of security, and pretending you have more than that is how you end up trusting a fingerprint you should not.