Blockchain analytics tools provide critical intelligence to compliance teams, regulators, and investigators. These professionals use that intelligence to uncover illicit activity, prioritize investigations, support enforcement actions, and ultimately hold bad actors accountable. But those outcomes depend on one thing: the quality of the underlying blockchain data.
If the data is wrong, investigators will waste time and resources chasing false leads, and compliance analysts could miss sanctions exposure. The downstream effects can be even worse: a single incorrect attribution can discredit hundreds of related insights, undermine entire investigations, and lead to wrongful customer terminations.
Selecting the right blockchain analytics provider is therefore a mission-critical decision. Evaluating the quality of the underlying data requires more than comparing feature lists or coverage claims. Any provider performing rigorous analytical work should be able to explain the methodology behind its conclusions, the evidence supporting them, and the safeguards used to prioritize accuracy. It should also be able to show that its claims hold up under scrutiny and independent testing. The following questions are designed to assess the rigor, transparency, and evidentiary standards behind a provider’s methodology as part of your due diligence.
How addresses get grouped
- How do you determine that multiple addresses belong to the same entity?
Some methodologies establish common ownership deterministically. Others infer ownership probabilistically. Both approaches can be useful, but it is important to understand which method is being used and when.
- What happens when your grouping methods get it wrong?
Every technique has blind spots. For example, UTXO co-spending heuristics assume that all inputs to a transaction share an owner, but CoinJoin transactions combine inputs from multiple unrelated users, so they need to be identified and excluded. A good provider has mapped out these edge cases and built protections against them, rather than simply assuming errors are rare.
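To make the CoinJoin edge case concrete, here is a minimal sketch of co-spend grouping that skips suspected CoinJoins. It is an illustration under stated assumptions, not any provider's actual method: the transaction shape (a dict with `inputs` as addresses and `outputs` as values), the function names, and especially the naive `looks_like_coinjoin` check are hypothetical. Real detection is considerably more involved.

```python
from collections import Counter


class UnionFind:
    """Minimal disjoint-set structure for grouping addresses."""

    def __init__(self):
        self.parent = {}

    def find(self, item):
        self.parent.setdefault(item, item)
        while self.parent[item] != item:
            # Path halving keeps lookups fast on long chains.
            self.parent[item] = self.parent[self.parent[item]]
            item = self.parent[item]
        return item

    def union(self, a, b):
        root_a, root_b = self.find(a), self.find(b)
        if root_a != root_b:
            self.parent[root_b] = root_a


def looks_like_coinjoin(tx, min_equal_outputs=3):
    """Hypothetical, deliberately naive check (an assumption for this sketch):
    several distinct input addresses funding several equal-valued outputs."""
    if not tx["outputs"]:
        return False
    most_common_count = Counter(tx["outputs"]).most_common(1)[0][1]
    distinct_inputs = len(set(tx["inputs"]))
    return (distinct_inputs >= min_equal_outputs
            and most_common_count >= min_equal_outputs)


def cluster_by_cospend(transactions):
    """Group addresses spent together, skipping suspected CoinJoins."""
    uf = UnionFind()
    for tx in transactions:
        if not tx["inputs"] or looks_like_coinjoin(tx):
            continue  # no inputs, or inputs likely belong to unrelated users
        first, *rest = tx["inputs"]
        uf.find(first)  # register single-input addresses too
        for address in rest:
            uf.union(first, address)
    return uf
```

With this sketch, a transaction spending from A and B and another spending from B and C would place A, B, and C in one group, while a transaction with three distinct inputs and three equal-valued outputs would be skipped, so its inputs are not merged. The point of the design is that the exclusion runs before any merge, because a wrong merge cannot easily be undone once later merges build on it.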
- Do you use different techniques for different blockchains?
Different blockchains, like Bitcoin and Ethereum, operate in fundamentally different ways with distinct architectures, transaction models, and behavioral patterns. As a result, the techniques used to group addresses together should differ as well. If a provider uses the same general terminology across blockchains, ask what methodology is actually being applied on each chain.
How entities get labeled
- What evidence supports your labels, and how reliable is it?
A label confirmed by a reliable source, such as a dataset seized by law enforcement, is very different from one based on a single uncorroborated report, such as an anonymous tip.
- If you changed the label, would the address grouping still hold up?
The grouping and the label should be independent. If removing a label causes the grouping to fall apart, neither claim stands on its own.
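One way to picture this independence is a data model in which cluster membership is derived only from grouping evidence, with the label stored as a separate claim. The sketch below is an assumption made for illustration; the class and field names are hypothetical and do not describe any provider's schema.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass(frozen=True)
class GroupingEvidence:
    """Why two addresses are believed to share an owner."""
    address_a: str
    address_b: str
    heuristic: str  # e.g. "co-spend"; free text in this sketch


@dataclass
class Cluster:
    grouping_evidence: List[GroupingEvidence]
    label: Optional[str] = None         # who the entity is claimed to be
    label_source: Optional[str] = None  # where that claim came from

    def addresses(self) -> Set[str]:
        """Membership depends on grouping evidence only, never on the label."""
        members: Set[str] = set()
        for evidence in self.grouping_evidence:
            members.update((evidence.address_a, evidence.address_b))
        return members
```

In a model like this, changing or removing `label` leaves `addresses()` unchanged, which is the property the question is probing for.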
- Do you distinguish between who runs a wallet and who uses it?
When you deposit crypto at an exchange, your deposit address is linked to you, but the exchange controls it. Failing to distinguish between users and service providers can lead to incorrect ownership claims and attribution errors. The same challenge exists with nested entities, where one company relies on another company’s custodial or wallet infrastructure. A provider should be able to explain how it differentiates between who interacts with an address and who ultimately controls it.
How methodology gets tested
- Has your methodology been challenged in court?
Legal proceedings test whether the methodology behind clustering and attribution can be admitted as evidence. A methodology that has satisfied the Daubert standard is fundamentally different from one that has never faced it, even if they seem similar on the surface. If a provider’s methods are relied upon in investigations, compliance decisions, or enforcement actions, understanding how those methods have performed under legal scrutiny can provide valuable insight into their rigor and reliability.
- Have you participated in independent accuracy studies?
Opportunities to verify the accuracy of blockchain analytics are rare and valuable. When law enforcement seizes wallet infrastructure, however, outside parties can compare the empirical evidence with a provider’s attribution data. These moments provide a unique opportunity to validate whether a provider’s methodologies produce accurate results in the real world. Has your provider welcomed that kind of external testing, or avoided it?
- Where do you draw the line with machine learning?
ML is great for spotting patterns. But if ML outputs automatically get treated as confirmed facts, errors can quickly multiply. Understanding where a provider relies on machine learning can help you distinguish between evidence-based conclusions and probabilistic assessments that may require further validation. Ask how your provider uses ML, and whether ML outputs are kept separate from other methodologies and clearly labelled as such.
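As a rough illustration of what keeping ML outputs clearly labelled could look like in data, the sketch below tags every attribution with the basis for the claim so that model inferences are never silently mixed with confirmed results. The `Basis` categories, the `Attribution` fields, and the `split_by_basis` helper are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Basis(Enum):
    DETERMINISTIC = "deterministic"  # established by direct evidence
    ML_INFERENCE = "ml_inference"    # probabilistic model output


@dataclass(frozen=True)
class Attribution:
    address: str
    entity: str
    basis: Basis
    confidence: float  # model score for ML inferences; 1.0 otherwise


def split_by_basis(attributions):
    """Separate evidence-based claims from ML inferences that need review."""
    confirmed = [a for a in attributions if a.basis is Basis.DETERMINISTIC]
    needs_review = [a for a in attributions if a.basis is Basis.ML_INFERENCE]
    return confirmed, needs_review
```

Because the basis travels with each record, a downstream consumer can choose to act only on the confirmed list and route everything else to an analyst for further validation.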
- Can you explain how any given cluster was built?
For any specific cluster, a provider should be able to walk you through how it was constructed and what evidence supports it. If they can’t trace how a cluster came together, there is little basis for trusting that the cluster is correct.
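Traceability of this kind is easier when every link between two addresses is stored together with the evidence that justified it. The following is a minimal sketch under that assumption: `build_evidence_graph` and `explain_link` are hypothetical helpers, and the evidence values are free-text placeholders. It uses a breadth-first search to recover the chain of evidence connecting two addresses in a cluster.

```python
from collections import defaultdict, deque


def build_evidence_graph(links):
    """links: iterable of (address_a, address_b, evidence) tuples."""
    graph = defaultdict(list)
    for a, b, evidence in links:
        graph[a].append((b, evidence))
        graph[b].append((a, evidence))
    return graph


def explain_link(graph, start, target):
    """Return the chain of evidence connecting two addresses, or None."""
    queue = deque([start])
    came_from = {start: None}
    while queue:
        current = queue.popleft()
        if current == target:
            break
        for neighbour, evidence in graph.get(current, []):
            if neighbour not in came_from:
                came_from[neighbour] = (current, evidence)
                queue.append(neighbour)
    if target not in came_from:
        return None  # no recorded evidence connects the two addresses
    steps = []
    node = target
    while came_from[node] is not None:
        previous, evidence = came_from[node]
        steps.append((previous, node, evidence))
        node = previous
    return list(reversed(steps))
```

Each returned step names two addresses and the evidence that joined them, so a reviewer can check the chain link by link. A result of `None` means the stored evidence does not connect the two addresses at all, which is exactly the situation the question is meant to surface.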
Any blockchain analytics company should be able to provide clear and specific answers to these questions. Such answers are the product of transparency, quality control, and strong evidentiary standards, the same standards your investigations depend on.
To learn more about Chainalysis’s data standards, read Defining the Cluster, our formal ontology for blockchain address analysis and intelligence claims, and why we published it.