Trending topics in VLDB, via keywords in the titles of publications from 1975 to 2013.
Last year's analysis and commentary is at vldb2012.html.
The more interesting cluster analysis is at index.html.
Thanks to @samrmadden for the 2000 - 2011 titles.
Thanks to DBLP for the rest!
If you want, the data and scripts are on GitHub.
These trends are based on the keywords across all years (1975 - now) of VLDB publications.
As we would expect, the most consistently popular keywords are all tightly related to our expertise as a community: databases, indexes, joins, models, and management are all at the core of what we do.
Here, we look for keywords that have an overall upward trend.
It's interesting that queries, indices, and efficiency are new in the past two decades. We can also see the web burst onto the scene, bringing search along with it. We see the excitement, then decline of streaming. Distributed processing is on the rise in parallel with "big data".
On the long time scale, database, models and relational as keywords are dying!
We can see that object oriented, rules, and activity came and went. We always talk about XML being dead, and it sure looks that way on paper, along with streams, services, and caching.
Let's now look at keywords that have burst onto the scene since I started graduate school (2007). These keywords are selected by computing the ratio of "the average number of times a keyword is used since 2010" to "the average before 2010".
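As a rough illustration of that ratio, here is a minimal Python sketch. It is not the script from the GitHub repository: the function name `burst_ratios`, the `titles_by_year` input shape, the whitespace tokenization, the per-year averaging, and the choice to treat 2010 itself as part of "since 2010" are all assumptions made for the example. Keywords that never appear before the split year get an infinite ratio here, which is also just one possible convention.

```python
from collections import Counter


def burst_ratios(titles_by_year, split_year=2010):
    """Map each keyword to (avg uses per year since split_year) / (avg before).

    titles_by_year: dict of year -> list of title strings (hypothetical format).
    """
    before, since = Counter(), Counter()
    years_before = years_since = 0

    for year, titles in titles_by_year.items():
        # Assumption: the split year itself counts as "since".
        if year >= split_year:
            bucket = since
            years_since += 1
        else:
            bucket = before
            years_before += 1
        for title in titles:
            # Naive tokenization: lowercase and split on whitespace.
            bucket.update(title.lower().split())

    ratios = {}
    for word in set(before) | set(since):
        avg_since = since[word] / max(years_since, 1)
        avg_before = before[word] / max(years_before, 1)
        # Assumption: a keyword unseen before the split gets an infinite ratio.
        ratios[word] = avg_since / avg_before if avg_before else float("inf")
    return ratios


# Tiny made-up input, only to show the shape of the data.
example = {
    2008: ["query processing"],
    2009: ["query optimization"],
    2010: ["cloud query"],
    2011: ["cloud cloud scaling"],
}
ratios = burst_ratios(example)
```

Sorting `ratios` by value in descending order would then surface the keywords that grew the most after the split year. A real run would likely also need stop-word removal and a minimum-count threshold so that rare words do not dominate.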
MapReduce, scaling, and the cloud are still at the peak of the Gartner hype cycle, and there are so many systems that it's hard to even compare them.
Web data is inherently uncertain, so we need probabilistic techniques to search for similar data.
Finally, it's nice to see that crowdsourcing is still trending upwards. Perhaps work with a more human and social angle is ready for the limelight.
It's good to see that data, query and databases are never far from our minds.
The topics I've decided to work on look like a mixed bag. Provenance or lineage seems to be down, but I believe in second chances. Good to know that data analysis and workload-driven research is a stable and increasingly popular topic!