DataSift wrangles social data for brands

“I couldn’t stop Salesforce being successful. I needed a new mission with a smaller company.”

That was Tim Barker’s tongue-in-cheek account of leaving Salesforce’s EMEA marketing operation, where he’d risen to be European marketing leader, to become Chief Product Officer at the San Francisco-based social data platform DataSift. Don’t take “social data” to mean social media management, though, or social listening and publishing. We’re talking big data here: the data of the global social conversation. Handling it, as Barker told me, “is a hard engineering problem to solve.”

Barker started out as an engineer, in fact, with a computer science degree, before moving into the content and collaboration space. Koral, the on-demand content management platform he co-founded, and which was acquired by Salesforce less than a year after it launched, helped users share and work on the latest versions of documents, from any subscriber’s desktop–synchronized, fully indexed, and auto-tagged.

Now he’s helping build something else. DataSift grew out of Tweetmeme, a Twitter curation tool which helped users surface relevant content by category. (Ironically, Twitter Moments, which launched today, has Twitter perform that service on its users’ behalf, which is not necessarily a progressive step.) Nick Halstead, Tweetmeme’s creator, repositioned the technology to become a data curation service for brands, but on a huge scale.

There are something like five million pieces of social data created every minute, Barker told me. DataSift, however, “was never about owning lots of data. It’s the compute piece, not the storage piece,” he said. As Barker describes it, DataSift is the mirror image of Splunk, the intelligence platform which processes machine-created data. DataSift turns on the hose of human-created social data and makes it accessible and usable for clients.

Businesses usually start out with a brand-centric/brand reputation interest in social media, said Barker. But increasingly they want to use social data to understand not just conversations but audiences. “We train algorithms on data sets,” Barker said. “We do as much processing as we can on the data when we first see it. Further down the pipeline, brands can ask specific questions.”

Data categorization starts with Vedo, a trainable engine which applies out-of-the-box and/or custom taxonomies to social updates. Sentiment analysis is an important element in DataSift’s categorization toolkit. The aim is not just to score posts for positive and negative sentiment, but to tag mood and intent automatically: intent to purchase, for example.
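
To make the idea concrete, here is a minimal sketch of what tagging and scoring a post might look like in principle. It is not Vedo and not DataSift's API: the taxonomy, the word lists, and the `tag_post` function are all hypothetical, and simple keyword matching stands in for the trained models Barker describes.

```python
import re

# Hypothetical taxonomy: tag name -> trigger phrases (illustrative only).
TAXONOMY = {
    "intent:purchase": ["want to buy", "going to buy", "just ordered"],
    "topic:phones": ["phone", "handset"],
}

# Hypothetical sentiment word lists; a real engine would use trained models.
POSITIVE = {"love", "great", "excellent"}
NEGATIVE = {"hate", "awful", "broken"}


def tag_post(text):
    """Return taxonomy tags and a crude sentiment score for one post."""
    lowered = text.lower()
    words = re.findall(r"[a-z']+", lowered)
    # A tag applies when any of its trigger phrases appears in the post.
    tags = sorted(
        tag
        for tag, phrases in TAXONOMY.items()
        if any(phrase in lowered for phrase in phrases)
    )
    # Score = positive word count minus negative word count.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return {"tags": tags, "sentiment": score}
```

Calling `tag_post("I love this phone and want to buy it")` would attach both the purchase-intent tag and the phones topic tag, along with a positive score. The point is the shape of the output, a post enriched with tags and a score, rather than the matching method.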

It’s a platform-as-a-service offering, running in the cloud on a monthly subscription basis, ready for embedding in other solutions. Essentially, DataSift isn’t an end-user application. It’s a “foundational technology,” said Barker, which provides the fuel for data-driven practices, especially marketing practices, but provides it in an intelligent, organized fashion, rather than as an undifferentiated hose of noise. Currently it pushes organized, tagged and scored data to over 1,000 companies.

DataSift partnered initially with Twitter, then added blog data, and partnered with Facebook earlier this year. Facebook topic data (what “audiences are saying on Facebook about events, brands, subjects and activities”) is accessed via API within the DataSift platform, with demographic data wrapped in but personally identifying information stripped out. In other words, topic data provides rich insights into potential audiences, but doesn’t allow individual ad targeting (DataSift is committed to Privacy by Design). “It’s a way to work with a network’s own data,” said Barker, “but it can never be tracked back to individuals.” It makes DataSift a powerful processor of what Barker calls “unattributable data.”
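
As a rough illustration of "unattributable" data, the sketch below counts posts on a topic by demographic group and never copies a user identifier into the result. The record fields, the `aggregate_topic` function, and the `min_count` cutoff for suppressing very small groups are assumptions made for the example; the article does not describe how DataSift or Facebook actually implement this.

```python
from collections import Counter


def aggregate_topic(posts, topic, min_count=2):
    """Count posts about `topic` per (age band, gender) group.

    User identifiers are never read into the output, and groups smaller
    than `min_count` are dropped (an assumed safeguard, not a documented one).
    """
    counts = Counter()
    for post in posts:
        if topic in post["topics"]:
            counts[(post["age_band"], post["gender"])] += 1
    return {group: n for group, n in counts.items() if n >= min_count}
```

What comes back is a set of audience-level counts, such as how many women aged 25-34 mentioned a topic, with nothing that points to a single person.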

Of course, said Barker, “optimization can get you only so far.” It creates a temptation to throw money at things: hence what Barker called “the impending crisis of crap content.” The challenge is to drop the idea of creating once-and-for-all content, and do the hard work of building content for different audiences, available through different channels. “How can we apply new technologies, like deep learning, but simplify them enough to put them in the hands of marketers? Unless we can solve those kinds of problems, we can’t make the [marketing tech] market bigger, or democratize it to companies of all sizes.”