Ajeet R. Pathak*, Manjusha Pandey and Siddharth Rautaray Pages 394-402 (9)
Background: The large volume of data generated on social media platforms calls for scalable topic modeling to identify the current trends and themes of the events discussed there. Topic modeling plays a crucial role in many natural language processing applications, such as sentiment analysis, recommendation systems, event tracking, and summarization.
Objectives: The proposed work aims to adaptively extract dynamically evolving topics from streaming data, infer current trends, and track how topics trend over time. World-level events often lead many otherwise uncorrelated streaming channels to start discussing similar topics. We therefore also examine how such channels affect topic modeling when their discussions converge on similar topics.
Methods: This paper puts forth an adaptive framework for dynamic and temporal topic modeling using deep learning. The framework approximates online latent semantic indexing, constrained by regularization, on streaming data using an adaptive learning method. It is built from deep layers of a feedforward neural network.
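To make the idea of online, regularized latent semantic indexing concrete, the following is a minimal sketch under stated assumptions, not the framework described in this paper. It replaces the deep feedforward network with plain NumPy gradient steps, uses Adagrad as a stand-in for the "adaptive learning method", and uses the smooth surrogate sqrt(x^2 + eps) as one possible differentiable ℓ1 penalty. The class name `OnlineTopicModel` and all parameter names (`lam_u`, `lam_v`, `lr`, `eps`) are hypothetical.

```python
import numpy as np


def smooth_l1(x, eps=1e-8):
    """Differentiable surrogate for |x| (an assumption, not the paper's choice)."""
    return np.sqrt(x * x + eps)


def smooth_l1_grad(x, eps=1e-8):
    """Elementwise gradient of smooth_l1."""
    return x / np.sqrt(x * x + eps)


class OnlineTopicModel:
    """Hypothetical sketch: online regularized LSI trained one mini-batch at a time."""

    def __init__(self, vocab_size, n_topics, lam_u=0.01, lam_v=0.1, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        # U is the term-topic matrix (vocab_size x n_topics).
        self.U = 0.1 * rng.standard_normal((vocab_size, n_topics))
        self.lam_u, self.lam_v, self.lr = lam_u, lam_v, lr
        self.G = np.zeros_like(self.U)  # Adagrad accumulator of squared gradients

    def infer(self, D):
        """Topic weights V (n_topics x batch) for documents D (vocab_size x batch).

        Closed-form ridge solution of min_V 0.5*||U V - D||^2 + 0.5*lam_v*||V||^2.
        """
        k = self.U.shape[1]
        A = self.U.T @ self.U + self.lam_v * np.eye(k)
        return np.linalg.solve(A, self.U.T @ D)

    def loss(self, D):
        """Per-document reconstruction loss plus the smooth l1 penalty on U."""
        V = self.infer(D)
        R = self.U @ V - D
        data_term = (0.5 * np.sum(R * R) + 0.5 * self.lam_v * np.sum(V * V)) / D.shape[1]
        return data_term + self.lam_u * np.sum(smooth_l1(self.U))

    def partial_fit(self, D):
        """Update U from one mini-batch of the stream and return its topic weights."""
        V = self.infer(D)
        R = self.U @ V - D
        grad = R @ V.T / D.shape[1] + self.lam_u * smooth_l1_grad(self.U)
        self.G += grad * grad
        self.U -= self.lr * grad / (np.sqrt(self.G) + 1e-8)  # Adagrad step
        return V


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    model = OnlineTopicModel(vocab_size=30, n_topics=3)
    for _ in range(100):
        # Each batch is a synthetic stand-in for a slice of the document stream.
        batch = np.abs(rng.standard_normal((30, 3))) @ np.abs(rng.standard_normal((3, 20)))
        model.partial_fit(batch)
    print(model.loss(batch))
```

Because each call to `partial_fit` sees only the current mini-batch, the term-topic matrix can follow topics as they drift over the stream. The sparsity penalty on `U` encourages each topic to load on a small set of terms.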
Results: The framework supports dynamic and temporal topic modeling and scales to large collections of data. We performed exploratory data analysis and correspondence analysis on a real-world Twitter dataset. The results indicate that the approach works well for extracting the topics associated with a given hashtag. Given a query, it extracts both the implicit and the explicit topics associated with the query terms.
Conclusion: The proposed approach is a suitable solution for topic modeling over Big Data. It approximates the regularized Latent Semantic Indexing model using deep learning with a differentiable ℓ1 regularization, which lets the model operate adaptively on streaming data in real time. The model also supports extracting aspects from sentences based on the interrelation of topics and thus supports aspect modeling in aspect-based sentiment analysis.
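The abstract does not state the objective being approximated. As an illustration only, a regularized LSI objective with a differentiable ℓ1 term could take the following form. The symbols are assumptions introduced here: d_n is the term vector of document n, U is the term-topic matrix, v_n is the topic representation of document n, λ1 and λ2 are regularization weights, and ε > 0 is a small smoothing constant that makes the ℓ1 penalty differentiable at zero.

```latex
\min_{U,\,\{v_n\}} \;
\sum_{n=1}^{N} \left( \tfrac{1}{2}\,\lVert d_n - U v_n \rVert_2^2
  + \tfrac{\lambda_2}{2}\,\lVert v_n \rVert_2^2 \right)
+ \lambda_1 \sum_{i,k} \sqrt{U_{ik}^2 + \epsilon}
```

As ε approaches 0, the last term approaches λ1‖U‖1, so the surrogate keeps the sparsity-inducing effect of the ℓ1 norm while remaining suitable for gradient-based training.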
Keywords: Big data, temporal topic modeling, online latent semantic indexing, deep learning, data analysis, sentiment analysis.
School of Computer Engineering, Kalinga Institute of Industrial Technology (KIIT) University, Bhubaneswar 751024