In the past I have talked about how we designed and built our AI Factory to be ready for 2021 and beyond. We also discussed how a functioning data lake is a key enabler of business flexibility and of products born purely out of data. It is time to take a closer look at the intersection of those two subjects with a concrete example: social media trend detection and prediction.
Many of our customers are wondering whether there are valuable insights to be gained by looking at the social graph. The answer is most likely yes, though it is substantially harder to figure out what those insights might be. In the following section you’ll find a collection of things that we have seen work in the past.
Assuming you have a functional data lake and the capacity to write data pipelines, there are still tough issues ahead. First of all, legally obtaining social media data comes with limitations, and rightfully so. You’re going to have to apply for proper Developer Credentials for the major social media platforms and make a case for what kind of data you plan to use and why. Most platforms also impose limits on volume and on the types of content you can obtain (for free). That being said, the free tiers are generally quite generous and will work for most use cases.
Twitter
Can obtain global and local trends as well as public tweets and their metadata (500k tweets in the free tier). Those can be from any hashtag.
Can obtain trending videos for any country as well as very detailed metadata for those videos. Can also get data from particular channels.
Instagram & Facebook
Can obtain trending public posts. Be warned: the certification process can be quite lengthy here.
Unfortunately there is no public developer program for TikTok at the time of this writing.
Can fetch trends on anything, for any region. Be warned: making sense of this data can sometimes be a little tough, since all data points are relative to each other.
At the time of this writing we are still applying for this one; we’ll keep you updated. 😉
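The warning above about data points that are only relative to each other can be made a little more concrete. Below is a minimal sketch of one common workaround, not necessarily what runs in our pipelines: it assumes each request returns values scaled relative to the other terms in that same request, and that you include the same anchor term in every request over the same region and time window. The function name, the anchor term, and the sample numbers are all made up for illustration.

```python
def rescale_to_anchor(batch, anchor):
    """Express every series in a batch as a multiple of the anchor's mean.

    batch  -- dict mapping a term to its list of relative values
    anchor -- a term present in every batch (hypothetical convention)
    """
    anchor_values = batch[anchor]
    anchor_mean = sum(anchor_values) / len(anchor_values)
    if anchor_mean == 0:
        raise ValueError("anchor series is all zeros; pick a busier anchor")
    return {
        term: [value / anchor_mean for value in values]
        for term, values in batch.items()
    }


# Two separate requests, each scaled differently by the service (assumed),
# but both containing the same anchor term.
batch_a = {"anchor": [50, 50], "fries": [100, 80]}
batch_b = {"anchor": [100, 100], "trailer": [40, 20]}

scaled_a = rescale_to_anchor(batch_a, "anchor")
scaled_b = rescale_to_anchor(batch_b, "anchor")

# After rescaling, "fries" and "trailer" share the same unit:
# multiples of the anchor's average interest.
print(scaled_a["fries"], scaled_b["trailer"])
```

The choice of anchor matters: a term with steady, non-zero interest gives more stable results than one that spikes or sits near zero.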
Well, this is where the fun part starts, isn’t it? I can only guess what kind of use cases you have in mind. Don’t hesitate to get in touch if you want to discuss them with us. For the story at hand, we quickly realized that the next hard question after getting access to the data is figuring out what data to get. Fetching everything and figuring out what to do with it afterwards is normally not an option: none of the social media networks will allow you to do that. Some guiding questions might be:
To close things out, here’s a visual impression from one of our data pipelines. In that particular case we were interested in very local trends, for a subject that is typically not trending every day. We also applied some of our ML Power to the data by adding to each piece of social media content its sentiment, the domain it relates to (politics, media, entertainment, etc.), and whether it contains any known entities. Finally, we measured the impact of each trending topic through its metadata (likes, shares, views, etc.) and were able to forecast future developments for that impact. Interestingly, even with only a few days of data we could already surface relevant trends in multiple industries, such as the extremely successful Spider-Man trailer when it came out, a new type of fries pushed by influencers, and many more. Truly exciting what this kind of tech can enable!
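To give a feel for the impact measurement and forecasting step, here is a minimal sketch, assuming impact is a weighted sum of engagement counts and the forecast is a plain least-squares line through the daily values. The weights, class name, and function names are hypothetical and chosen for illustration; a production pipeline would likely use a richer model.

```python
from dataclasses import dataclass


@dataclass
class DailyStats:
    likes: int
    shares: int
    views: int


def impact(stats, w_likes=1.0, w_shares=2.0, w_views=0.01):
    """Weighted engagement score; the weights are illustrative assumptions."""
    return (w_likes * stats.likes
            + w_shares * stats.shares
            + w_views * stats.views)


def linear_forecast(values, steps_ahead=1):
    """Fit a least-squares line to equally spaced values and extrapolate."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a line")
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxx = sum((i - x_mean) ** 2 for i in range(n))
    sxy = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept + slope * (n - 1 + steps_ahead)


# Three days of made-up engagement numbers for one trending topic.
history = [
    DailyStats(likes=10, shares=1, views=1000),
    DailyStats(likes=20, shares=2, views=2000),
    DailyStats(likes=30, shares=3, views=3000),
]
daily_impact = [impact(day) for day in history]
next_day = linear_forecast(daily_impact, steps_ahead=1)
print(daily_impact, next_day)
```

A straight line is only a reasonable first guess over very short horizons; trending topics tend to spike and decay, so treat the extrapolation as a baseline to beat rather than a prediction to trust.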