
Understanding the Language – Open & Big Data in Construction

Open and big data offer many opportunities for the built environment, but the subject can appear overwhelming to those working in the sector. BRE, in collaboration with the Constructing Excellence group Generation for Change, has been investigating the potential for data in the built environment. The study set out to demonstrate the benefits that collecting, managing, analysing and even releasing data can bring to a range of organisations within the construction sector.

From the outset it became clear that one of the major hurdles preventing representatives of the construction industry from participating in discussions about data was the often complicated concepts and language surrounding it. This article aims to provide clear and concise definitions of some of the regularly used terms in this area. Further articles are planned on utilising data and on how open data can help to develop the construction industry.

Internet of Things

Although data can be collected in a variety of ways, a substantial decrease in the cost of sending recorded data automatically over a network has made this the preferred method. This ability to automatically receive data about a particular ‘thing’ has led to the concept of the Internet of Things (IoT), in which technology is used to collect data on various ‘things’ and send it to a desired location via the internet. The recorded information could be almost anything, such as the location of a site vehicle, the temperature of a particular room, or the amount of energy a building is consuming.

A good example of this is automatic meter reading (AMR). An organisation could have a particularly large site with many meters and sub-meters recording gas, electricity and water consumption. Without the ability to record and send this data automatically over a network, the Facilities Management (FM) team would have to read and record every meter manually each time a reading was wanted. By utilising automated technology, the site can collate readings far more efficiently and record them much more frequently. This gives the FM team a much more detailed dataset to analyse in order to establish which areas of the site require the most attention.
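
As a rough illustration of the kind of analysis that automatically collected meter data makes possible, the sketch below totals consumption per meter and ranks the meters from highest to lowest. The meter names, timestamps and readings are invented for the example, and a real AMR system would supply its data in its own format.

```python
from collections import defaultdict

# Hypothetical half-hourly readings: (meter_id, timestamp, kWh used in the interval).
# All names and values are invented for illustration.
READINGS = [
    ("block_a_electricity", "2015-03-02T09:00", 12.5),
    ("block_a_electricity", "2015-03-02T09:30", 14.0),
    ("block_b_electricity", "2015-03-02T09:00", 30.0),
    ("block_b_electricity", "2015-03-02T09:30", 33.5),
    ("workshop_electricity", "2015-03-02T09:00", 8.0),
    ("workshop_electricity", "2015-03-02T09:30", 7.5),
]


def total_by_meter(readings):
    """Sum the consumption recorded by each meter."""
    totals = defaultdict(float)
    for meter_id, _timestamp, kwh in readings:
        totals[meter_id] += kwh
    return dict(totals)


def rank_meters(readings):
    """Return (meter_id, total_kwh) pairs, highest consumption first."""
    totals = total_by_meter(readings)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for meter_id, kwh in rank_meters(READINGS):
        print(f"{meter_id}: {kwh:.1f} kWh")
```

With readings arriving every half hour rather than once a month, the same approach could be extended to compare times of day or days of the week.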


Open data

‘Data that anyone can access, use and share’ (Open Data Institute)

After collecting data, an organisation can gain a range of benefits by opening it up to the world. These range from an improved reputation through increased transparency to business development opportunities, which will be discussed in further detail later on. The UK government has already opened up a range of data covering mapping, health, spending, transport, crime, the environment and several other areas. However, during this study it became apparent that people have different views on what they consider to be open data. These differing definitions also vary substantially in complexity, which further complicates the situation. In an attempt to simplify the language in this area, the study considered the most concise definitions.


The Open Data Institute definition quoted above explains very simply what open data is: data that has been opened up for other individuals or organisations to utilise. However, there are often discussions about what other stipulations data must meet to be considered truly open. One of the main discussion points is the format of the data. Many believe that open data should be released in a format which allows people to make use of it. For example, if data were released in a closed format such as a PDF, it would be difficult for people to learn anything from it. If instead it were released as a spreadsheet, it could be combined with other datasets, allowing people to undertake analysis which could lead to new or improved applications. However, not everyone feels this way. Many would simply like to see data being opened up, regardless of format, and argue that forcing everyone to format data in a particular way could prevent many organisations from getting involved.
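
To show why a tabular format matters, here is a minimal sketch of combining two datasets that share a common identifier. It assumes both have been released as CSV (a plain-text spreadsheet format); the building identifiers, column names and figures are all hypothetical.

```python
import csv
import io

# Two hypothetical open datasets published as CSV. The column names and
# values are invented for illustration.
ENERGY_CSV = """building_id,annual_kwh
B001,120000
B002,45000
"""

FLOOR_AREA_CSV = """building_id,floor_area_m2
B001,1500
B002,900
"""


def read_column(csv_text, key_field, value_field):
    """Read one numeric column from CSV text, keyed by an identifier column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_field]: float(row[value_field]) for row in reader}


def energy_intensity(energy_csv, area_csv):
    """Combine the two datasets to give kWh per square metre for each building."""
    energy = read_column(energy_csv, "building_id", "annual_kwh")
    area = read_column(area_csv, "building_id", "floor_area_m2")
    # Only buildings present in both datasets can be compared.
    return {b: energy[b] / area[b] for b in energy if b in area}


if __name__ == "__main__":
    for building, intensity in energy_intensity(ENERGY_CSV, FLOOR_AREA_CSV).items():
        print(f"{building}: {intensity:.1f} kWh per m2")
```

The same figures locked inside a PDF would first have to be retyped or extracted by hand before any comparison like this could be made.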

Another commonly debated point is the licensing of the data. Many consider that for a dataset to be truly open, it should carry no licensing or usage restrictions, meaning that anyone can use and modify the data, including for their own commercial benefit. However, this is where the definition of open data becomes less clear, because others believe that a dataset has still been opened even if it carries a licence which prevents either alterations to the work or its commercial use. Uncertainties like these often leave people confused about a particular term, and similar discussions around data are likely to continue. It is therefore important to gain a basic understanding of the concept of open data (the definition above is a good starting point), which will enable people to develop their own opinions on specific areas of debate.

Releasing open data

Case study: UK Government

The UK Government has been one of the most prominent forerunners in the release of open data (this will be discussed further later in the report). In an attempt to appear more transparent, the Government is allowing the public to see the data which drives its policies. To do this successfully, the Government has now placed all of its open datasets in one easy-to-search location,

There is a range of different datasets (Figure 1), many with explanations of how to understand the data (Figure 2). For example, one dataset that has been released covers registered residential property sales in England and Wales since 1995. This particular dataset is updated monthly and currently contains over 24 million records. It can also be downloaded in various formats, allowing users to choose the one which best suits their analysis.

Figure 1, Preview data from the Price Paid Data

Figure 2, Explanation on how to use Price Paid Data


Big Data

Big data, as its name suggests, is simply a very large amount of data. McKinsey Global Institute defines big data as ‘datasets whose size is beyond the ability of typical database software tools to capture, store, manage, and analyse’ (McKinsey Global Institute). Depending on the datasets it holds, a company that puts in place measures to capture, manage and analyse big datasets effectively can gain a much better understanding of its business, customers, products and competitors. This can in turn lead to improvements in efficiency, reduced operating costs, improved products and services, and potentially increased sales.

Open data and big data are two commonly used terms for describing data, but they are not the same thing: each has its own defining characteristics. It is therefore possible for a dataset to be both big and open, neither big nor open, or just one of the two.

Licensing

As discussed previously, when either utilising or releasing open data it is important to understand what licences are attached to it, and what those licences allow you to do with the dataset. There is a range of different licences, each with varying conditions of use. An excellent place to learn more about the types of licence available is the non-profit organisation Creative Commons. Creative Commons has developed free tools which enable users to create simple, standardised copyright licences based upon how they want people to use their data. The licence can then be attached to the data before it is shared. Tools like this can help to break down the initial barriers to data release by providing an easy-to-use method for protecting organisational data (Figure 3).

Figure 4, An example of how Creative Commons can be used to develop a licence for an organisation’s work

Granularity

When an organisation is considering opening up its data, it is important to think about what this will reveal. The level of detail at which a dataset is released is called its granularity. For example, if BRE were to release data on the energy efficiency of a sample of UK dwellings, it would need to choose the right level of granularity for security reasons. This could involve anonymising the data so that the occupants’ personal data is kept private. Granularity can also be relevant in other situations. For example, in future, when projects are registered with the BREEAM certification scheme, there is an agreement that this data can be used by BRE provided it has been aggregated in such a way that people cannot pinpoint the performance of a particular development. This will allow the industry to better understand what levels are being achieved for a range of criteria, whilst preventing information about a particular construction project from being scrutinised. With all this in mind, it is easy to understand why considering granularity at an early stage is essential.
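
The idea of releasing data at a coarser granularity can be sketched in a few lines. The example below is an illustration only, not BRE's method: it replaces dwelling-level records with regional averages and withholds any region containing fewer than a chosen number of dwellings. The addresses, regions, scores and the threshold of three are all hypothetical, and a threshold like this does not by itself guarantee anonymity.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical dwelling-level records; every value is invented for illustration.
DWELLINGS = [
    {"address": "1 Example Road", "region": "North", "energy_score": 62},
    {"address": "2 Example Road", "region": "North", "energy_score": 70},
    {"address": "3 Example Road", "region": "North", "energy_score": 66},
    {"address": "9 Sample Lane", "region": "South", "energy_score": 81},
]


def aggregate_by_region(dwellings, min_group_size=3):
    """Release only regional averages, dropping addresses entirely.

    Regions with fewer than min_group_size dwellings are withheld, because an
    average over one or two homes could still point to a particular dwelling.
    """
    groups = defaultdict(list)
    for dwelling in dwellings:
        groups[dwelling["region"]].append(dwelling["energy_score"])
    return {
        region: {"dwellings": len(scores), "mean_energy_score": mean(scores)}
        for region, scores in groups.items()
        if len(scores) >= min_group_size
    }


if __name__ == "__main__":
    print(aggregate_by_region(DWELLINGS))
```

Here the single dwelling in the hypothetical South region is left out of the released figures, while the North region is reported only as an average across its three dwellings.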

About the study

In 2014, the BRE Trust commissioned the study ‘Open & Big Data in Construction’, which was undertaken by BRE in collaboration with Generation for Change (G4C, part of Constructing Excellence). This project set out to engage with the construction industry to discuss, and identify solutions to, the barriers currently slowing the use of open and/or big data within the built environment. In order to achieve this, the study looked to demonstrate the potential benefits that collecting, managing, analysing and even releasing data can have on a range of organisations within the construction sector.

One of the most prominent barriers to data utilisation identified during the study was that those who are relatively new to the area often feel overwhelmed by the technical language used. Consequently, this report also looks to simplify wording and clearly explain some of the most important issues. The main objective of the study is to raise awareness of the benefits that data utilisation can offer, and at the same time increase the level of data literacy along the supply chain. A combination of structured interviews and debate events has been undertaken to gather a wide range of opinion and to draw on expert knowledge in this area.

Acknowledgements

This project relied on the input and support of a number of people from across the industry:

  • Antonio Pisarno (Marcel Mauer Architects / G4C)
  • Stuart Chalmers (BRE)
  • Professor Tim Stonor (Space Syntax)
  • Stephen Wooldridge (Barratt Homes)
  • Ben Cave (formerly worked with Citadel on the move, now with the ODI)
  • Tom Brown (Lambeth Council)
  • James Johnston (Open Utility)