The global volume of data created, captured, copied and consumed has ballooned in recent years, increasing from two zettabytes in 2010 to 64.2 zettabytes in 2020, with predictions that it will reach 181 zettabytes in 2025¹.
This growth has largely been fuelled by new technologies and software in various industries, including the construction industry, which has introduced digital solutions such as building information modelling (BIM), virtual reality, electronic document management, artificial intelligence, robotics, 3D printing, drones and other data collection and monitoring technology.
The advent of big data poses new challenges for companies seeking to make use of the data produced.
It also presents an issue in the disputes context, as the resources required to collect, process and review large amounts of data are often at odds with a litigant’s desire (as well as the objective of various legal codes and arbitral rules) to deal with cases efficiently and at proportionate cost.
Parties that can make the best use of this data will place themselves at a significant advantage in any dispute resolution process.
Against this background, there has been a growing acceptance among lawyers, arbitral tribunals and the courts of the use of sampling as a way to resolve this tension and to enable the prosecution of claims and the use of data as evidence in circumstances where doing so might otherwise be disproportionately costly.
What is sampling?
Sampling is a means of finding out about the characteristics of a large population by asking questions of, or investigating, a subset of that population. The results of the investigation of the sample are then extrapolated to the whole population.
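The idea of investigating a subset and extrapolating to the whole can be illustrated with a minimal sketch. It assumes a simple random sample and a yes/no characteristic (for example, whether an item is defective); the function name, the population of 10,000 items and the 10% underlying rate are hypothetical and chosen purely for illustration.

```python
import random

def extrapolate_from_sample(population, has_characteristic, sample_size, seed=2024):
    """Investigate a random subset and extrapolate the result to the whole population."""
    rng = random.Random(seed)  # fixed seed so the selection can be reproduced and checked
    sample = rng.sample(population, sample_size)  # each item has an equal chance of selection
    hits = sum(1 for item in sample if has_characteristic(item))
    sample_rate = hits / sample_size
    # Extrapolation: apply the rate observed in the sample to the full population
    estimated_total = sample_rate * len(population)
    return sample_rate, estimated_total

# Hypothetical population: 10,000 items, of which every tenth has the characteristic
population = list(range(10_000))
rate, estimated_total = extrapolate_from_sample(
    population, lambda item_id: item_id % 10 == 0, sample_size=400
)
print(f"Rate in sample: {rate:.1%}; estimated total in population: {estimated_total:.0f}")
```

The estimate will rarely match the true figure exactly, which is why the reliability of the extrapolation has to be assessed alongside the result itself.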
The English courts have approved of sampling as a way to resolve a variety of construction disputes². While traditionally sampling has been associated with claims relating to defects on construction projects where the value of each individual defect is too low to justify evaluating each defect individually³, the Court of Appeal has recently also approved its use in professional negligence claims⁴.
However, sampling has even wider application and should be considered whenever the quantity of data available or required is too substantial to justify a full review or analysis. For example, our team has worked on multi-billion US dollar claims relating to delay and disruption occurring in the engineering process of mega energy infrastructure projects. These projects regularly involve the use of design databases, 3D modelling software and the production, review and approval of hundreds of thousands of design documents.
The review and approval of the 3D model and design documents can involve the exchange of hundreds of thousands of further comments and communications between the owner, the main contractor, subcontractors and, in some cases, relevant authorities. In our experience, delay and disruption in the engineering process is one of the most common causes of major delays and cost overruns on large construction projects, but it is also one of the hardest causes to analyse properly and to allocate responsibility for.
Types of sampling
A key requirement of effective sampling is that the sample investigated must be representative of the relevant population as a whole. Broadly speaking, sampling can be carried out in two ways: either by selecting samples at random (as in statistical/probability sampling) or by selecting samples which are intended or expected to be representative of the whole (as in non-statistical sampling).
While both methods are acceptable in principle⁵, a difficulty with non-statistical sampling is that, because it requires subjective judgement in selecting the sample, it is difficult to demonstrate that the sample is truly representative of the population. Such studies are therefore particularly vulnerable to criticisms of bias, and it can be difficult to obtain buy-in from the other side and the tribunal or court.
It is our view that statistical sampling should therefore be the preferred method of running a large sampling exercise: the selection process is random, and the results can be supported with stated levels of confidence and degrees of precision, which quantify the uncertainty in the extrapolation.
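As a hedged illustration of what a confidence level and degree of precision look like in practice, the sketch below computes an interval around a rate observed in a random sample. It assumes a simple random sample and uses the normal approximation for a proportion, which is only one of several methods a statistical expert might choose; the figures (48 items out of 400) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def proportion_confidence_interval(hits, sample_size, confidence=0.95):
    """Normal-approximation interval for a rate observed in a simple random sample."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95% confidence
    p = hits / sample_size
    margin = z * sqrt(p * (1 - p) / sample_size)  # the "degree of precision"
    return p, max(0.0, p - margin), min(1.0, p + margin)

# Hypothetical result: 48 of 400 sampled items have the characteristic of interest
p, low, high = proportion_confidence_interval(hits=48, sample_size=400)
print(f"Estimated rate {p:.1%}; 95% interval {low:.1%} to {high:.1%}")
```

On these assumed figures the interval runs from roughly 9% to 15%. A larger sample narrows the interval, which is the trade-off between cost and precision that the parties need to consider when settling the sample size.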
How is statistical sampling carried out?
The statistical sampling process is broadly carried out as follows:
Getting it right
As we have set out, the English cases have established that it is, in theory, possible to prove a claim on the basis of sampling. However, the failure rate is high and in both the Amey and ICI cases, the claims failed due to sampling errors⁶.
Some of these errors were egregious and, as noted by the courts, may have indicated an intention to obtain a beneficial, rather than representative, sample. While advocates naturally wish to obtain the best results for their client, it is important to avoid the temptation to obtain what may be perceived as the most beneficial sample, as doing so risks completely invalidating the results of the sampling exercise.
Given the risks involved with pursuing a case on the basis of sampling, it is important to consider whether sampling is, in fact, appropriate for your case. Depending on the value of the claim and the volume of data, it may be more appropriate to review all the data and bring a case on a traditional basis. Your access to the population, and the characteristics of the population, may also affect whether or not your sampling exercise is likely to produce a reliable result.
For example, if you are unable to include a substantial amount of your population in your sampling frame, or if your population is very varied in nature, it may not be desirable to use sampling or it may be necessary to introduce additional sampling techniques (such as stratification) in order to obtain precise results.
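Stratification can be sketched as follows. This is a minimal example assuming proportional allocation, in which each category (stratum) is sampled at random in proportion to its share of the population; the document categories and counts are hypothetical, and other allocation rules may suit a particular case better.

```python
import random
from collections import defaultdict

def stratified_sample(items, stratum_of, total_sample_size, seed=0):
    """Draw a random sample from each stratum in proportion to the stratum's size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[stratum_of(item)].append(item)
    sample = {}
    for name, members in strata.items():
        # Proportional allocation, with at least one item taken from every stratum
        n = round(total_sample_size * len(members) / len(items))
        n = min(len(members), max(1, n))
        sample[name] = rng.sample(members, n)
    return sample

# Hypothetical population of 10,000 design documents in three categories
documents = (
    [("drawing", i) for i in range(6_000)]
    + [("specification", i) for i in range(3_000)]
    + [("calculation", i) for i in range(1_000)]
)
sample = stratified_sample(documents, lambda doc: doc[0], total_sample_size=200)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes)  # {'drawing': 120, 'specification': 60, 'calculation': 20}
```

Sampling within each category ensures that a small but distinct group is not missed by chance, which can improve precision when the population is very varied.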
If sampling is the correct, or only, way to bring the claim, the case law, and our experience with running cases based on sampling, indicate that the following key points should be borne in mind.
Beware of bias
Put simply, bias means there is a systematic difference between the sample results and the true characteristics of the population the sample is meant to represent. A significant amount of bias indicates that the results are likely to have been skewed, often inadvertently, by the sample selection or measurement process. Most commonly, bias arises when particular members of the population are improperly given a greater chance of selection, which means that the sample cannot be considered representative of the population. The presence of bias is often the reason why sampling claims fail.
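The effect of unequal chances of selection can be shown with a short simulation. It is a sketch under stated assumptions: a hypothetical population of 10,000 items with a true defect rate of 10%, and a flawed selection process in which defective items are three times as likely to be picked (drawn with replacement for simplicity).

```python
import random

rng = random.Random(7)  # fixed seed so the illustration is reproducible

# Hypothetical population: True marks a defective item; the true defect rate is 10%
population = [i % 10 == 0 for i in range(10_000)]

# Random selection: every item has the same chance of being chosen
fair_sample = rng.sample(population, 1_000)
fair_rate = sum(fair_sample) / len(fair_sample)

# Biased selection: defective items are three times as likely to be chosen
weights = [3 if defective else 1 for defective in population]
biased_sample = rng.choices(population, weights=weights, k=1_000)
biased_rate = sum(biased_sample) / len(biased_sample)

print(f"True rate 10.0%; random sample {fair_rate:.1%}; biased sample {biased_rate:.1%}")
```

Under these assumptions the biased process is expected to report a defect rate of around 25%, two and a half times the true rate, and no increase in sample size would correct it.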
Bias can be caused by a variety of issues in the sampling process, but common sources of bias include:
Agree on the use of sampling, and the methodology, where possible
It is preferable to agree the sampling methodology and approach with the other side at an early stage. This will reduce the risk of methodology-based objections to the results, and allow for issues to be rectified before significant costs are incurred.
A court/tribunal will also be more able to compare and evaluate the parties' results and conclusions if the parties have agreed on an approach and methodology. If the approaches taken by the parties are too dissimilar, decisions are likely to be more unpredictable and the chances of settling based on the results of the sample will be reduced.
Use an expert
Statistical sampling is complex, and is not an area that most legal professionals are expert or experienced in. It can therefore be helpful to appoint a statistical expert at an early stage to advise on the appropriateness of sampling and supervise the sampling exercise. If faced with a sampling claim, an expert would also be able to evaluate the sampling exercise undertaken by the other party.
Aside from the importance of getting the process right, educating a court/tribunal on sampling is key, as many judges/tribunal members may never have come across a claim based on sampling in their career. An expert can be helpful when it comes to explaining the science of statistical sampling and addressing the uncertainty in the results that will always be present to an extent.
Conclusion
Construction projects and the disputes that arise from them are well known to be data-heavy in nature. The increasing size and complexity of construction projects, coupled with the use of new technology, are only likely to contribute to an increasing need for techniques such as sampling to enable disputes to be resolved in a proportionate manner.
While the use of statistical sampling is not yet commonplace, the recent approval by the courts and increasing understanding in the legal field of the science of statistical sampling indicate that the use of sampling will only grow in the years to come. We may well be, as Lady Rose, Justice of the Supreme Court, has suggested, at the start of a statistical revolution⁷.
Daniel Garton, Partner and Primrose Tay, Associate, White & Case
Any views expressed in this publication are strictly those of the authors and should not be attributed in any way to White & Case LLP.
---
1 Based on IDC and Statista estimates
2 Building Design Partnership Ltd v Standard Life Assurance Ltd (2021) EWCA; Amey LG Ltd v Cumbria County Council (2016) EWHC 2856 (TCC)
3 See for example, Amey LG v Cumbria County Council (2016) EWHC 2856 (TCC) and Imperial Chemical Industries Ltd v Merit Merrell Technology Ltd (No.2) (2017) EWHC 1763 (TCC)
4 Building Design Partnership Ltd v Standard Life Assurance Ltd (2021) EWCA
5 See Amey LG v Cumbria County Council (2016) EWHC 2856 (TCC) paragraph 25.103
6 See also In re Hardieplank Fiber Cement Siding Litig., No. 12-md-2359, 2018 WL 262826 at *26 (D. Minn. Jan. 2, 2018), and In re Chevron U.S.A., Inc., 109 F.3d 1016, 1019-20 (5th Cir. 1997), two U.S. cases which highlight the importance of a proper sampling methodology
7 A Numbers Game? Statistics in Public Law Cases (supremecourt.uk)