Hi everyone, today’s interview is with David White, CEO of import.io, a web data platform and free web scraping tool that lets you transform any website into a table of data or an API in minutes without even writing any code. I’ve used this tool a few times for my projects and it’s been super helpful!
Today we’re talking about the importance of indexable data as the cornerstone of decision-making in businesses, his view of the semantic web, and how import.io used startup competitions as a (genius!) customer acquisition plan.
From Web Scrapers to import.io
There’s a huge amount of data on the web but the web was never designed for it. If you want to get data off the web and put it in a spreadsheet or database or somewhere you can make decisions from, it’s tough and you end up doing things like running web scrapers. Having written a fair few scrapers in his time, David felt this was a horribly inefficient way to use this incredibly valuable resource. So import.io is all about taking that enormous amount of data on the web and making it really easy for people to make decisions from it.
The business has been going for about 3 years and they’ve now got about 600,000 customers on the platform pulling about a billion data points a day, so they’re finally starting to get people to use the power of the web.
David was head of technology innovation at Royal Bank of Scotland—the largest bank in the world before it all went terribly wrong in 2008—and they were really struggling with making good, data-driven decisions. And as part of that journey, he really got excited about the possibility of using the web as a source of data to make better decisions.
Common (And Not So Common!) Use Cases of import.io
One of the most common uses is price-based decisions. In a market where pricing is highly dynamic, how do you know how to price? How do you know the decision points that drive you to a market-competitive price? To really know that, you have to gather lots of prices from the web: from competitors, from adjacent businesses, etc.
Another common use on the free platform is lead generation. In fact, one of the co-founders, Andrew Fogg, has a great 20-minute presentation on YouTube called “10,000 Leads in 10 Minutes” (see Resources below). So those are two use cases they see a lot. And, of course, you also get a lot of weird and wonderful use cases as well. He’s heard of an art installation in Barcelona that’s powered by data coming from their platform. Definitely not a typical use case!
The Semantics of the Semantic Web
The term “semantic web” dates back to the work that Sir Tim Berners-Lee was doing in the early 2000s. After he invented the World Wide Web and its standards, a disparity emerged between the original intention of HTML, which was a display language, and the reality of the information now being served on the web: content and data were being mixed together in the same output.
So the semantic web standards were really about trying to separate those two to make it easier for a machine, whether it be a browser, a crawler or a scraper, to be able to identify data on a page and extract it. The problem was that the semantic web was not widely adopted. Hence, many websites were not actually marked up clearly with “this is a place, this is a person, this is a price, etc.” And import.io makes this data more usable.
For example, on a shopping website you want to identify each product’s name, description, price, and so on, and to do it in such a way that on every page of that site, and on many other sites that hold similar content, each of those elements is clearly identified and can be expressed in a table format.
Because the reality is that once you get data into a table, you can do really cool things with it, even something as simple as sorting, but until you get it into a table, that’s not possible. And data is the cornerstone of decision-making in businesses.
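To make the "page to table" idea concrete, here is a minimal sketch using only Python's standard library. It is not how import.io works internally; the sample HTML, the class names (product, name, price) and the helper extract_products are all assumptions made up for illustration.

```python
from html.parser import HTMLParser

# Hypothetical product listing; the class names are assumptions for illustration.
SAMPLE_HTML = """
<ul>
  <li class="product"><span class="name">Kettle</span><span class="price">24.99</span></li>
  <li class="product"><span class="name">Toaster</span><span class="price">19.50</span></li>
  <li class="product"><span class="name">Blender</span><span class="price">39.00</span></li>
</ul>
"""


class ProductTableParser(HTMLParser):
    """Collects one row (a dict) per element with class="product"."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._field = None  # column currently being read, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if css_class == "product":
            self.rows.append({})  # start a new table row
        elif css_class in ("name", "price") and self.rows:
            self._field = css_class

    def handle_data(self, data):
        if self._field and data.strip():
            self.rows[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def extract_products(html):
    """Return the products as a list of rows with a numeric price column."""
    parser = ProductTableParser()
    parser.feed(html)
    return [{"name": r["name"], "price": float(r["price"])} for r in parser.rows]


rows = extract_products(SAMPLE_HTML)
# Once the data is in a table, sorting is a one-liner.
cheapest_first = sorted(rows, key=lambda r: r["price"])
for row in cheapest_first:
    print(f'{row["name"]:<10}{row["price"]:>8.2f}')
```

The hard part in practice is the step this sketch skips: real sites rarely label their elements this cleanly, which is the gap the semantic web standards were meant to close.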
Free Tool to Lead Tool—A Natural Ecosystem
When you use their free tool, you typically go onto a page, click on an element on that page and say, “I want this piece of data and I’m going to describe it as a product name.” What import.io does is take all those descriptions of product names and convert the semantic description to how it is typically expressed in HTML. So, as David points out, this is a translation of languages: from English to HTML.
There are many different ways to mark up that data, but after having seen enough examples (and they’ve seen millions of examples) they’ve gotten really good, at an algorithmic level, at going on a page and saying, “In a probabilistic sense, that element is actually describing a product name based on what is around it, based on the page.”
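The interview doesn’t describe the actual algorithm, so the following is only a toy illustration of what “in a probabilistic sense” could mean: a simplified naive Bayes-style estimate learned from a handful of labeled elements. The training examples, the feature names (such as tag:h1 or class:price) and both functions are hypothetical.

```python
from collections import Counter

# Hypothetical labeled examples: (features of an element, is it a product name?)
TRAINING = [
    ({"tag:h1", "class:title"}, True),
    ({"tag:h2", "class:product-name"}, True),
    ({"tag:h1", "class:product-name"}, True),
    ({"tag:span", "class:price"}, False),
    ({"tag:p", "class:description"}, False),
    ({"tag:span", "class:title"}, False),
]


def train(examples):
    """Count how often each feature appears in positive and negative examples."""
    counts = {True: Counter(), False: Counter()}
    totals = Counter()
    for features, label in examples:
        totals[label] += 1
        counts[label].update(features)
    return counts, totals


def probability_product_name(features, counts, totals):
    """Estimate P(product name | features) with add-one smoothing.

    Simplification: only the features present on the element are multiplied in.
    """
    scores = {}
    for label in (True, False):
        score = totals[label] / sum(totals.values())  # prior for this label
        for feature in features:
            score *= (counts[label][feature] + 1) / (totals[label] + 2)
        scores[label] = score
    return scores[True] / (scores[True] + scores[False])


counts, totals = train(TRAINING)
p_heading = probability_product_name({"tag:h1", "class:product-name"}, counts, totals)
p_price = probability_product_name({"tag:span", "class:price"}, counts, totals)
print(f"h1.product-name: {p_heading:.2f}")
print(f"span.price:      {p_price:.2f}")
```

With millions of examples instead of six, and features drawn from the surrounding page rather than just the tag and class, the same idea gives a score for how likely an unseen element is to be a product name.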
Their free tool is also their primary lead tool for the paid data sets, because typically people come to their website and want to do some quick prototyping with data, but if they have a long-term production requirement then David’s team will reach out to those customers and say, “These are the benefits of taking data as a service rather than continuing to build it yourself through the free tool or writing scrapers, etc.”
A natural ecosystem!
Startup Competitions as a (Genius!) Customer Acquisition Plan
They didn’t have any master plan but they did have a pretty good idea about who the customers were, so they didn’t go into business without first having done a reasonable amount of customer discovery up front. They honed it down to three primary personas, and the two they were really focused on were data analysts and data scientists.
They decided that the best way to get immediate attention from these two personas was to enter a startup competition run by Strata, which puts on conferences for the big data community. They won the competition, which gave them their first market presence and got them some traction around the private beta. For 6 months they focused on those couple thousand customers to learn what they were doing wrong.
That process of going through a startup competition helped them define their pitch, as well as their features and benefits, and the press attention from winning helped them fill up that private beta almost instantly. They then entered four other startup competitions, including one of the largest, the Dublin Web Summit, and won all of them! As a result, at the Dublin Web Summit alone they acquired about 5,000 users within 24 hours of winning it.
Tips on winning startup competitions: it’s a bit like pitching to VCs. You have to really grab their attention early on and really explain why this is a pain point that many people have, and you’ve got to be clear on what the USPs (Unique Selling Propositions) are. And it helped them with fundraising as well because quite often there were major VCs in the audience.
Using Events To Acquire Customers—And Avoiding These Mistakes
Their conference is called Extract and they ran their first one about 18 months in. Their purpose was to build a community and talk about opportunities in general (not just their company/product). In doing the first one they realized it was a great lead generation process because they had brought together people who had the problem and were also interested in solving it and were doing interesting things with data—which is obviously a huge target market for them.
They’ve run Extract four times now: the first one in London attracted 150 people, and the latest one was in San Francisco with about 600 attendees. These events turned into a major part of their marketing drive, but one caveat is that the content has got to be genuinely good and interesting. They made a policy from day one to not make the content all about import.io, but rather about data stories. Making an event all about your own product is an easy mistake to make.
Another mistake they made was that they didn’t follow up after the events. Get great speakers and build themes into the conference so that people enjoy the whole event and actually look forward to the next one—and don’t forget to follow up afterwards to maintain that excitement. Now they produce a series of blog posts about each individual’s talk to continue the momentum days and even months afterwards.
A Big Struggle They Faced While Growing import.io
Companies that scale very quickly—like 100% per annum—create some fairly unique problems within the organization. For example, processes that used to work really well break down quite often. You have to understand that growing an organization quickly inherently means you’re going to break the processes that supported you really well 6-12 months ago.
Hiring was one of those processes for them. They were really focused on hiring through their networks at first, which quickly became a problem when they saw that managers were spending as much time on hiring as on their actual job. So the one lesson he’s learned is to accept the fact that whatever you put into place today will probably break down in 12 months’ time. And the fact that it breaks is perfectly ok because it indicates that your organization is growing, so embrace constant change!
Advice To His 25-Year-Old Self
He would tell his younger self to focus on the big problems and the big picture. You can get drawn into the details or winning every little battle, and you can even get complacent because the things you’re doing are working, but unless you’re trying to solve a big problem those are probably not the best use of your time.
His Ideal Day
Traveling makes it difficult, but broadly speaking David:
One Must-Read Book
David has given out about 20 copies of one of his favorite books called Bold: How to Go Big, Create Wealth and Impact the World by Peter Diamandis and Steven Kotler. He finds it fascinating because it really talks about the fact that the most interesting problems are also the ones that probably affect a billion people, so they’re the ones that from both a personal and intellectual perspective and a financial perspective are worth solving.
He also recommends (and so do I!) the essay “Maker’s Schedule, Manager’s Schedule” by Paul Graham.
Resources from this interview:
Connect with Eric Siu:
Eric Siu (@ericosiu) is the CEO at Single Grain, a digital marketing agency that focuses on paid advertising and content marketing. He contributes regularly to Entrepreneur Magazine, Fast Company, Forbes and more.