Data Demystified: Solving the B2B Data Problem

Dan Shewan

Senior Content Manager

Data isn’t just an abstract concept at ZoomInfo — it’s the lifeblood of our entire suite of products and the engine that drives our customers’ growth. 

To the layperson, there may not be a huge difference between business-to-business (B2B) and business-to-consumer (B2C) data — it’s all just information. But to our engineering, data science, and product teams, B2B data is an entirely different animal from B2C that poses many unique obstacles and challenges.

In this installment of our Data Demystified series, we explore what it’s like to work with B2B data, and how our product teams invent and introduce new products and features.

Exploring ZoomInfo’s Intelligence Layer

Before our engineering and product teams can build dynamic data products, they need to identify, gather, and verify the underlying data that serves as the base of ZoomInfo’s intelligence layer.

You can think of our intelligence layer as the foundation upon which the ZoomInfo product suite is built. The data is gathered from millions of sources of information. Everything from corporate websites to social media updates to email signatures can be an information signal, which we then analyze, examine, and update constantly to ensure a reliable stream of up-to-the-minute information.

One of the biggest challenges for our data scientists and researchers is verifying that this information is correct. 

Take your personal email address, for example. The chances are pretty good that you’re still using the same personal email address you’ve used for several years, as most people don’t tend to update personal contact information frequently. 

Now think about how many times you’ve changed your work email during the past 10 years. If you’ve worked two or three jobs during that time, even at the same company, you may have changed your work email multiple times. To complicate matters, many people don’t update their professional contact information as proactively as they do their personal details. 

This means our engineers, data scientists, and researchers must take great care to validate and qualify this business information to ensure our algorithms can more accurately identify the most current data.

Diving Deeper into the Data

Email signatures are one of the richest, most reliable sources of up-to-date B2B data. A signature is one of the first things employees change when transitioning into a new role, which makes it a strong data signal for our product teams.

“There’s often no better source of professional information than your email signature,” says Derek Smith, ZoomInfo’s chief strategy officer. “We’re not only getting phone numbers and titles and emails, but also evidence that a contact is still employed.”

Part of the challenge of working with B2B data is how long it can take for a notable change to be made public. Sources such as LinkedIn can be valuable, but they often rely on users to manually update their information, which can be inconsistent. In these instances, our technologies and researchers have to go deeper to infer when changes take place by analyzing other data points in context, such as updates to professional contact details or changes to organizational charts.

“When people leave college and take their first job, we can learn about them accepting a role at a given company, even if they don’t sign up for LinkedIn, by observing business activity,” Smith says. “That helps us to grow our database, develop a really unique data set, and keep our business data incredibly clean.”

Identifying specific data points is only part of the puzzle. To ensure we have clean, reliable information, our data and engineering teams also have to evaluate the accuracy and credibility of data coming from disparate sources. 

“All of these sources have different levels of credibility,” says Meghan Collier, a data and engineering product manager at ZoomInfo. “These sources have different origins. They give you conflicting information. That’s where I come in as the bridge between our data analysis team and our data engineering team.”
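To make the idea of weighing conflicting sources concrete, here is a minimal sketch of one way it could be done. It is an illustration only, not ZoomInfo's actual method: the source labels, the credibility weights, the 180-day half-life, and the `resolve` function are all hypothetical. Each observation votes for a value, and its vote is scaled by how credible the source is assumed to be and how recently it was seen.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Observation:
    source: str    # hypothetical source label
    value: str     # the claimed value, e.g. a job title
    seen_on: date  # date the source was observed


# Hypothetical credibility weights, chosen only for illustration.
CREDIBILITY = {
    "email_signature": 0.9,
    "corporate_website": 0.7,
    "social_profile": 0.5,
}


def resolve(observations, today, half_life_days=180):
    """Return the value with the highest credibility-weighted,
    recency-decayed score, or None if there are no observations."""
    scores = defaultdict(float)
    for obs in observations:
        age_days = max((today - obs.seen_on).days, 0)
        # An observation loses half its weight every half_life_days.
        decay = 0.5 ** (age_days / half_life_days)
        # Unknown sources get a low default weight (an assumption).
        scores[obs.value] += CREDIBILITY.get(obs.source, 0.1) * decay
    if not scores:
        return None
    return max(scores, key=scores.get)
```

In this sketch, a single recent email signature can outweigh two older, less credible sources that agree with each other, which mirrors the point that sources differ in both credibility and freshness.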

Verifying data accuracy isn’t always about identifying correct information. At times, incorrect or outdated information can also tell a valuable story. If someone’s email address no longer works, it probably means they moved into a different role or left the organization — additional data points for further contextual analysis.

Building Better Models

Data accuracy at ZoomInfo relies on a combination of algorithmic, machine-learning technologies and human insight. However, it would be inefficient and impractical for our research team to evaluate every data record manually. Instead, much of the research team’s time is spent training our machine-learning models to better identify and classify data inputs and to assess how trustworthy they are.

“The researchers teach our data scientists exactly what a good contact looks like, what a bad contact looks like. And that feedback is fueling our algorithms and making them better and better,” Smith says. “If you give really smart data scientists billions of data points, they’re going to come up with algorithms that do a good job of providing good data.”

ZoomInfo’s approach to validating data and improving the accuracy of machine-learning models is iterative, but far from linear. It’s a complex process that requires multiple teams to work together, constantly informing each other’s work and handing off improvements and iterations. It’s also a process that doesn’t end when those data models are put into production for our customers.

“The data science team builds the model,” Collier says. “It’s then analyzed by the data analysis team, then sent to research to validate. When we’ve decided this is how the model should be, the data engineering team, which is the team I’m on, takes it and puts it into production. We can then monitor it afterward.”

Solving New Problems

Customer feedback and competitive intelligence are major drivers of innovation at ZoomInfo.

In certain scenarios, potential new use cases surface from conversations with current and prospective customers. In others, opportunities to use the vast B2B data asset emerge organically, providing our product teams with hypotheses they can test before putting new features into production.

“We get an overwhelming amount of feedback from customers and from sales reps,” Smith says. “There’s the data that you see on the platform, and then there’s an incredible amount of data under the hood that isn’t quite ready for game time. If one customer asks for a feature, we’re not going to overreact and blow up our roadmap, but there are definitely themes that become apparent.”

ZoomInfo’s data and product teams use this feedback to evaluate how existing features are performing and how they might be improved. Our analysts examine how specific product features are being used and the actual results of those features. Our researchers also carefully monitor data traffic for mentions of specific competitor products and features, which can point to opportunities for product development.

Imagining the Future of B2B Data

The next challenge for our B2B data and product teams is to expand opportunities for more businesses to benefit from the power and insights of the ZoomInfo platform.

“We can build products that have features and capabilities that other companies will never be able to offer,” Smith says. “We have analysts that we use to help us understand where the market’s going. The number one opportunity is international growth. We’ve invested a lot in the growth of our data in Europe, but there are developing areas of the world where prospecting is just now taking off.”

One of the most significant areas of opportunity is applying ZoomInfo’s data extraction technologies to languages other than English. This includes Arabic, Chinese, Japanese, and other languages that, until now, have been underrepresented. This presents us with the unique opportunity to diversify our underlying data asset and bring ZoomInfo’s value to businesses and audiences all over the world.

Another goal for our data and product teams is helping our customers understand how data works and how they can use it to grow their businesses. According to Smith, that means solving new problems in new ways to demonstrate lasting value.

“What we try to do across our portfolio is build products that are made better by our data,” Smith says. “We’re really becoming an end-to-end platform, the go-to-market engine for sales and marketing people. I’m really excited about that transition because it’s allowing us to do so much more for our customers.”