All science is based on statistics. Science relies on data, whether it is used to prove a hypothesis or to create a mathematical prediction. Statistics even became its own subset of mathematics. Once computers became the primary tool for scientists to collect and analyze large amounts of data, a new subset of computer science was proposed. The idea behind “data science” was to create new and better ways to prepare, present, and model data and push the boundaries of statistics beyond theory and practice.
Now data science is a household word (even though most people probably couldn’t define its meaning) and has bled into every facet of our lives. Businesses in every industry already harness the predictive power of statistics thanks to data science tools and techniques. But one industry that data science has not materially changed is real estate.
There are good reasons for this. Real estate data comes from various sources, so it is not standardized enough to be useful. For a real estate company to be able to run models on properties, all of the company’s data has to be documented in the exact same way. Even something as simple as a non-disclosed unit number could throw off a statistical model. For small queries, data can always be double-checked by a human, but the point of data science is to enable statistics at scale to get the most out of a model.
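To make the standardization problem concrete, here is a minimal sketch in Python of how two sources might record the same property differently, and how a missing unit number changes the result. The `normalize_address` function, the abbreviation table, and the sample addresses are all hypothetical illustrations, not any vendor's actual pipeline; real address standardization handles far more cases.

```python
import re
from typing import Optional, Tuple

# Hypothetical abbreviation table; a real one would be much larger.
SUFFIXES = {"street": "st", "avenue": "ave", "road": "rd", "boulevard": "blvd"}

# Matches a unit designator ("apt 4b", "unit 12", "#4b") and captures the unit id.
UNIT_PATTERN = re.compile(r"(?:\b(?:apt|apartment|unit|suite|ste)\s+|#\s*)([0-9a-z]+)\b")


def normalize_address(raw: str) -> Tuple[str, Optional[str]]:
    """Return (street_address, unit) in one canonical lowercase form."""
    text = raw.lower().replace(".", "").replace(",", " ")
    unit = None
    match = UNIT_PATTERN.search(text)
    if match:
        unit = match.group(1)
        # Cut the unit designator out of the street portion.
        text = text[:match.start()] + " " + text[match.end():]
    words = [SUFFIXES.get(word, word) for word in text.split()]
    return " ".join(words), unit


# Two sources describing the same unit, written differently.
source_a = normalize_address("123 Main Street, Apt. 4B")
source_b = normalize_address("123 MAIN ST #4b")

# A third source that never recorded the unit number.
source_c = normalize_address("123 Main St.")

print(source_a == source_b)  # the first two records now match
print(source_a == source_c)  # the record without a unit does not
```

Even after cleanup, the third record cannot be matched to a specific unit, which is the kind of gap that can quietly distort a model run across thousands of properties.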
Automation like this is its own, messy science project, one that many property companies don’t have the expertise to take on. Since the average salary of a data scientist in the US is over $140,000, only the largest property firms have the resources to put together the kind of data science team that would be necessary to utilize modern data science techniques. “Managing data scientists is hard enough, but when they don’t even have data that is clean enough to be useful, it can be a very costly exercise,” said Ron Bekkerman, Strategic Advisor of real estate data platform Cherre.
What makes property data so challenging to utilize properly, from a data science perspective, isn’t just its lack of standardization. When it comes to property information, some of the most important pieces of data are not even available. “Property info by itself isn’t that useful,” Bekkerman said. “You need to connect it to other things like lenders, owners, tenants, and everything else in order for it to be helpful.” Data science is not just about understanding certain data fields. It is about learning the connections between different variables.
In order to make those connections, all of the information about the property needs to be put alongside all of the people and organizations that have interacted with it. This can be incredibly difficult because of the lack of transparency around property ownership. Many properties are held by an LLC or a trust that is oftentimes set up for the sole purpose of owning (and obscuring ownership of) a single property. Learning who is behind such an organization takes investigation, which makes the modeling process painfully slow and oftentimes prohibitively expensive.
To help property companies better understand the entire real estate landscape, Bekkerman and his team have created what they call a knowledge graph. After uncovering all of the information about each property in the country (including things like ownership and lending), they are able to put all of these points together in a giant universe of data. Every time there is some connection between two points on the graph, say properties that the same organization has owned at some point, those points get grouped closer together.
Eventually, these connections form galaxies and can give data scientists clues about the correlations between properties, owners, brokers, and lenders. “Once you have all of the data graphed, you can start to make queries about it, such as what a certain property owner is doing or where a certain lender tends to make loans,” Bekkerman said. Data science has brought to other industries a level of analysis that, until now, has been lacking in real estate.
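The idea of a knowledge graph, and the kinds of queries Bekkerman describes, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Cherre's implementation: the `KnowledgeGraph` class, the relation names `owned_by` and `financed_by`, and every property, owner, and lender below are invented for the example.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy undirected graph whose edges carry a relation label."""

    def __init__(self):
        # node -> set of (relation, neighbor) pairs
        self._edges = defaultdict(set)

    def add(self, source, relation, target):
        # Store the edge in both directions so either end can be queried.
        self._edges[source].add((relation, target))
        self._edges[target].add((relation, source))

    def neighbors(self, node, relation):
        """Return the nodes linked to `node` by `relation`, sorted."""
        return sorted(n for r, n in self._edges.get(node, ()) if r == relation)


def related_properties(graph, prop):
    """Properties that share an owner or a lender with `prop`."""
    related = set()
    for relation in ("owned_by", "financed_by"):
        for party in graph.neighbors(prop, relation):
            related.update(graph.neighbors(party, relation))
    related.discard(prop)
    return sorted(related)


# Hypothetical records; nodes are (kind, name) pairs.
graph = KnowledgeGraph()
elm = ("property", "12 Elm St")
oak = ("property", "40 Oak Ave")
pine = ("property", "7 Pine Rd")
owner = ("owner", "Elm Holdings LLC")
lender = ("lender", "First Example Bank")

graph.add(elm, "owned_by", owner)
graph.add(oak, "owned_by", owner)
graph.add(elm, "financed_by", lender)
graph.add(pine, "financed_by", lender)

# "What is a certain property owner doing?"
print(graph.neighbors(owner, "owned_by"))

# "Where does a certain lender tend to make loans?"
print(graph.neighbors(lender, "financed_by"))

# Properties pulled "closer together" by a shared owner or lender.
print(related_properties(graph, elm))
```

At national scale the same structure would hold millions of nodes and would likely live in a dedicated graph database, but the query pattern of following labeled edges from one entity to its neighbors stays the same.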
Data science is as old as statistics, but the invention of computers has turned it into its own discipline. Data scientists have quickly become the most sought-after and well-paid professionals in the business world today. They are responsible for much of the business intelligence that most sophisticated corporations rely on. But despite data science’s advances in other industries, real estate is only now starting to leverage these tools. The property companies that embrace these innovations will be poised to make faster, better-informed decisions that are based on science. Data science.