Social media mining

Social media mining is the process of representing, analyzing, and extracting actionable patterns and trends from raw social media data. The term “mining” is an analogy to the resource extraction process of mining for rare minerals. Resource extraction mining requires mining companies to sift through vast quantities of raw minerals; likewise, social media “mining” requires human data analysts and automated software programs to sift through massive amounts of raw social media data (e.g., social media usage, online behavior, sharing of content, connections between individuals, and online buying behavior) in order to discern patterns and trends. Organizations can use these patterns and trends to design their strategies or introduce new programs (or, for companies, new products, processes and services).

Social media mining uses a range of basic concepts from computer science, data mining, machine learning and statistics. Social media miners develop algorithms suitable for investigating massive files of social media data. Social media mining is based on theories and methodologies from social network analysis, network science, sociology, ethnography, optimization and mathematics. It encompasses the tools to formally represent, measure, model, and mine meaningful patterns from large-scale social media data. [1] In the 2010s, major corporations, as well as governments and not-for-profit organizations, engage in social media mining to find out more about key populations of interest, which, depending on the organization carrying out the mining, may be customers, clients, or citizens.

Background

As defined by Kaplan and Haenlein, [2] social media is the “group of internet-based applications that build on the ideological and technological foundations of Web 2.0, and that allow the creation and exchange of user-generated content.” There are many categories of social media including, but not limited to, social networking (Facebook or LinkedIn), microblogging (Twitter), photo sharing (Flickr, Photobucket, or Picasa), news aggregation (Google Reader, StumbleUpon, or Feedburner), video sharing (YouTube, MetaCafe), livecasting (Ustream or Twitch.tv), virtual worlds (Kaneva), social gaming (World of Warcraft), social search (Google, Bing, or Ask.com), and instant messaging (Google Talk, Skype, or Yahoo Messenger).

The first social media website was introduced by GeoCities in 1994. It enabled users to create their own homepages without sophisticated knowledge of HTML coding. The first social networking site, SixDegrees.com, was introduced in 1997. Since then, many other social media sites have been introduced, each providing service to millions of people. These individuals form a virtual world in which individuals (social atoms), entities (content, sites, etc.) and interactions (between individuals, between entities, between individuals and entities) coexist. Social norms and human behavior govern this virtual world. By understanding these social norms and models of human behavior and combining them with the observations and measurements of this world, one can systematically analyze and mine social media.

Social media mining is the process of representing, analyzing, and extracting meaningful patterns from data in social media, resulting from social interactions. It is an interdisciplinary field encompassing techniques from computer science, data mining, machine learning, social network analysis, network science, sociology, ethnography, statistics, optimization, and mathematics. Social media mining faces major challenges such as the big data paradox, obtaining sufficient samples, the noise removal fallacy, and the evaluation dilemma. Social media mining represents the world of social media in a computable way, measures it, and designs models that can help us understand its interactions. In addition, social media mining provides tools to analyze the dissemination of information and to study influence and homophily.

Uses

Social media mining is used across several fields, including business development, social science research, health services, and education ([3], [4]). Once the data has been processed through social media analytics, it can be applied to these various fields. Often, analysts use the patterns of connectivity that pervade social networks, such as assortativity: the social similarity between users that is induced by influence, homophily, and reciprocity and transitivity ([5]). These forces are then measured by statistical analysis of the nodes and the connections between them ([6]). Social analytics also uses sentiment analysis, because social media users often express positive or negative sentiment in their posts ([7]). This provides important social information about users’ emotions on specific topics ([8]).
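As an illustration of how such forces can be measured from nodes and connections, the following minimal sketch computes reciprocity and transitivity for a small directed “follows” graph. The function names, the edge-list representation, and the sample users are assumptions made for this example; they are not taken from any cited work, and self-loops are assumed to be absent.

```python
from itertools import combinations


def reciprocity(edges):
    """Fraction of directed edges (u, v) whose reverse edge (v, u) also exists."""
    edge_set = set(edges)
    if not edge_set:
        return 0.0
    mutual = sum(1 for (u, v) in edge_set if (v, u) in edge_set)
    return mutual / len(edge_set)


def transitivity(edges):
    """Global clustering coefficient of the undirected version of the graph.

    Computed as (closed triples) / (all connected triples), which equals
    3 * triangles / connected triples.
    """
    neighbors = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    closed = 0   # triples whose two end nodes are also connected
    triples = 0  # paths of length two, counted at their center node
    for nbrs in neighbors.values():
        for a, b in combinations(nbrs, 2):
            triples += 1
            if b in neighbors[a]:
                closed += 1
    return closed / triples if triples else 0.0


# Hypothetical sample data: (follower, followed) pairs.
follows = [
    ("ana", "ben"), ("ben", "ana"),  # a mutual pair
    ("ben", "cat"), ("cat", "ana"),
    ("cat", "dan"),
]

print(reciprocity(follows))   # 2 of 5 edges are reciprocated
print(transitivity(follows))  # 3 of 5 connected triples are closed
```

On real data these quantities are usually compared against a baseline, such as a random graph with the same number of nodes and edges, because the raw values alone do not show whether a network is unusually reciprocal or clustered.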

These three patterns have several uses beyond pure analysis. For example, influence can be used to determine the most influential user in a particular network ([9]). Companies are interested in this information when deciding whom to hire for influencer marketing. These influencers are determined by recognition, activity generation, and novelty, which can be measured using data from these sites ([10]). Analysts also value measures of homophily: the tendency of two similar individuals to become friends ([11]). Users have begun to rely on other users’ opinions in order to understand various subject matter ([12]). These analyses can also help create tailored recommendations for individuals ([13]). By measuring influence and homophily, online and offline companies are able to suggest specific products to individual consumers and to groups of consumers. Social media networks can also use this information themselves, for example to suggest to their users other accounts to interact with.
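A rough sketch of how influence and homophily might be quantified is given below. It makes two simplifying assumptions that do not come from the cited sources: the number of incoming “follows” edges is used as a crude stand-in for recognition, and homophily is approximated by the share of edges that connect users with the same attribute value. All names and sample data are hypothetical.

```python
from collections import Counter


def most_followed(follows):
    """Return (user, in_degree) for the user with the most incoming edges.

    In-degree is only a proxy for influence; ties are broken by first
    appearance, and an empty edge list raises IndexError.
    """
    in_degree = Counter(target for _, target in follows)
    return in_degree.most_common(1)[0]


def same_group_edge_fraction(edges, group_of):
    """Share of edges whose two endpoints have the same attribute value."""
    if not edges:
        return 0.0
    same = sum(1 for u, v in edges if group_of[u] == group_of[v])
    return same / len(edges)


# Hypothetical sample data: (follower, followed) pairs and one attribute per user.
follows = [
    ("ben", "ana"), ("cat", "ana"), ("dan", "ana"),
    ("ana", "ben"), ("dan", "cat"),
]
interest = {"ana": "music", "ben": "music", "cat": "sports", "dan": "sports"}

print(most_followed(follows))                       # ('ana', 3)
print(same_group_edge_fraction(follows, interest))  # 0.6
```

A high same-group fraction does not by itself demonstrate homophily: it has to be compared with the fraction expected if users connected regardless of the attribute, and it cannot separate homophily from influence without observations over time.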

Research

Research areas

  • Social media event detection – Social networks enable users to discuss and share recent news. As a result, they can be seen as a viable source of information for understanding currently emerging topics and events. [14] [15] [16] [17] [18]
  • Community structure (Community Detection / Evolution / Evaluation) – Identifying communities on social networks, studying how they evolve, and evaluating identified communities, often without ground truth. [1]
  • Network measures – Measuring centrality, transitivity, reciprocity, balance, status, and similarity in social media. [1]
  • Network models – Simulating networks with specific characteristics. Examples include random graphs (ER models), preferential attachment models, and small-world models. [1]
  • Information diffusion – Analyzing how information propagates in social media sites. Examples include herd behavior, information cascades, diffusion of innovations, and epidemic models. [1]
  • Influence and homophily – Measuring network assortativity and measuring and modeling influence and homophily. [1]
  • Recommendation in social media – Recommending friends or social media sites. [1] [19] [20]
  • Social search – Searching for information on the social web. [21]
  • Sentiment analysis in social media – Identifying collectively subjective information, e.g. positive and negative, from social media data. [22] [23]
  • Social spammer detection – Detecting social spammers who send unwanted spam to targeted users on social networks and any other website with user-generated content, often corroborating each other to boost their social influence, legitimacy, and credibility. [24] [25] [26] [27]
  • Feature selection with social media data – Transforming feature selection to harness the power of social media. [28] [29] [30] [31]
  • Trust in social media – Studying and understanding trust in social media. [32] [33] [34] [35]
  • Distrust and negative links – Exploring negative links in social media. [36] [37] [38]
  • Role of social media in crises – Social media, particularly Twitter, continues to play an important role during crises. [39] Studies show that it is possible to detect earthquakes [40] and rumors [41] using tweets published during a crisis. Developing tools that help first responders analyze tweets for better crisis response [42] and developing technologies that give them faster access to relevant tweets [43] are active areas of research.
  • Location-based social network mining – Mining human mobility for personalized point-of-interest (POI) recommendation on location-based social networks. [44] [45] [46] [47] [48] [49]
  • Provenance of information in social media – Social media can help in identifying the origin of information due to its unique features: user-generated content, user profiles, user interactions, and spatial or temporal information. [50] [51]
  • Vulnerability management – A user’s vulnerability on a social networking site can be managed in steps that include quantifying or measuring the user’s vulnerability and reducing or mitigating it. [52]
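
The network models item above mentions ER (Erdős–Rényi) random graphs. As a minimal sketch of that model, assuming the common G(n, p) formulation in which each possible undirected edge is included independently with probability p, a graph can be sampled as follows. The function names and parameters are chosen for this example only.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Sample an undirected G(n, p) graph as a list of edges (u, v) with u < v.

    Each of the n * (n - 1) / 2 possible edges is kept independently
    with probability p. A seed makes the sample reproducible.
    """
    rng = random.Random(seed)
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def degree_sequence(n, edges):
    """Degree of every node 0..n-1 in an undirected edge list."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return degree


edges = erdos_renyi(n=100, p=0.05, seed=42)
degrees = degree_sequence(100, edges)

# The expected mean degree is p * (n - 1) = 4.95; a sample will vary around it.
print(len(edges), sum(degrees) / len(degrees))
```

Simulated graphs of this kind are typically used as a baseline: properties measured on a real social network, such as its degree distribution or transitivity, are compared with those of a random graph of the same size and density.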

Publication venues

Social media mining research articles are published in computer science, social science, and data mining conferences and journals:

Conferences

Relevant conferences include Knowledge Discovery and Data Mining (KDD), World Wide Web (WWW), Conference on Information and Knowledge Management (CIKM), International Conference on Data Mining (ICDM), Association for Computational Linguistics (ACL), and Internet Measurement Conference (IMC).

  • KDD Conference – ACM SIGKDD Conference on Knowledge Discovery and Data Mining
  • WWW Conference – International World Wide Web Conference
  • WSDM Conference – ACM Conference on Web Search and Data Mining
  • CIKM Conference – ACM Conference on Information and Knowledge Management
  • ICDM Conference – IEEE International Conference on Data Mining
  • Association for Computational Linguistics (ACL)
  • Internet Measurement Conference (IMC)
  • International Conference on Weblogs and Social Media (ICWSM)
  • International Conference on Web Engineering (ICWE)
  • European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECML/PKDD)
  • International Joint Conferences on Artificial Intelligence (IJCAI)
  • Association for the Advancement of Artificial Intelligence (AAAI)
  • Recommender Systems (RecSys)
  • Computer-Human Interaction (CHI)
  • Social Computing, Behavioral-Cultural Modeling and Prediction (SBP)
  • HT Conference – ACM Conference on Hypertext
  • SDM Conference – SIAM International Conference on Data Mining
  • PAKDD Conference – Pacific-Asia Conference on Knowledge Discovery and Data Mining
  • DMKD Conference – Research Issues on Data Mining and Knowledge Discovery

Journals

  • IEEE Transactions on Knowledge and Data Engineering (TKDE)
  • ACM Transactions on Knowledge Discovery from Data (TKDD)
  • ACM Transactions on Intelligent Systems and Technology (TIST)
  • Social Network Analysis and Mining (SNAM)
  • Knowledge and Information Systems (KAIS)
  • ACM Transactions on the Web (TWEB)
  • World Wide Web Journal
  • Social Networks
  • Internet Mathematics
  • IEEE Intelligent Systems
  • SIGKDD Explorations

Social media mining is also represented at many data management and database conferences, such as the ICDE Conference, the SIGMOD Conference, and the International Conference on Very Large Data Bases.

See also

Methods
  • Text mining
Application domains
  • Web mining
  • Twitter mining
Companies
  • NUVI
Related topics
  • Social media
  • Profiling (information science)
  • Web scraping

References

  1. Zafarani, Reza; Abbasi, Mohammad Ali; Liu, Huan (2014). “Social Media Mining: An Introduction”. Retrieved 15 November 2014.
  2. Kaplan, Andreas M.; Haenlein, Michael (2010). “Users of the world, unite! The challenges and opportunities of social media”. Business Horizons.
  3. Zafarani, R., Abbasi, M. A., Liu, H. (2014). Social Media Mining. Cambridge University Press. http://dmml.asu.edu/smm.
  4. Singh, A. (2017). “Mining of Social Media Data of University Students”. Education and Information Technologies, 22: 1515-1526.
  5. Tang, J., Chang, Y., Aggarwal, C., Liu, H. (2016). “A Survey of Signed Network Mining in Social Media”. ACM Computing Surveys, 49: 3.
  6. Zafarani, R., Abbasi, M. A., Liu, H. (2014). Social Media Mining. Cambridge University Press. http://dmml.asu.edu/smm.
  7. Adedoyin-Olowe, M., Gaber, M., & Stahl, F. (2013). “A Survey of Data Mining Techniques for Social Media Analysis”.
  8. Laeeq, F., Nafis, T., & Beg, M. (2017). “Sentimental Classification of Social Media using Data Mining”. International Journal of Advanced Research in Computer Science, 8: 5.
  9. Zafarani, R., Abbasi, M. A., Liu, H. (2014). Social Media Mining. Cambridge University Press. http://dmml.asu.edu/smm.
  10. Zafarani, R., Abbasi, M. A., Liu, H. (2014). Social Media Mining. Cambridge University Press. http://dmml.asu.edu/smm.
  11. Tang, J., Chang, Y., Aggarwal, C., Liu, H. (2016). “A Survey of Signed Network Mining in Social Media”. ACM Computing Surveys, 49: 3.
  12. Adedoyin-Olowe, M., Gaber, M., & Stahl, F. (2013). “A Survey of Data Mining Techniques for Social Media Analysis”.
  13. Zafarani, R., Abbasi, M. A., Liu, H. (2014). Social Media Mining. Cambridge University Press. http://dmml.asu.edu/smm.
  14. Zarrinkalam, Fattane; Bagheri, Ebrahim (2017). “Event identification in social networks”. Encyclopedia with Semantic Computing and Robotic Intelligence. 01 (01): 1630002. doi:10.1142/S2425038416300020.
  15. Nurwidyantoro, A.; Winarko, E. (June 1, 2013). “Event detection in social media: A survey”. International Conference on ICT for Smart Society: 1-5. doi:10.1109/ICTSS.2013.6588106.
  16. “Event Detection from Social Media Data”. Retrieved 5 May 2017.
  17. “Event Detection in Social Media Data”. Retrieved 5 May 2017.
  18. Cordeiro, Mário; Gama, João (January 1, 2016). “Online Social Networks Event Detection: A Survey”. Solving Large Scale Learning Tasks. Challenges and Algorithms. Springer International Publishing. pp. 1-41.
  19. Tang, Jiliang; Tang, Jie; Liu, Huan (2014). “Recommendation in Social Media – Recent Advances and New Frontiers”. In Proceedings of the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining.
  20. Tang, Jiliang; Hu, Xia; Liu, Huan (2013). “Social Recommendation: A Review”. Social Network Analysis and Mining.
  21. Horowitz, Damon; Kamvar, Sepandar (2010). “The Anatomy of a Large Scale Social Search Engine”. In Proceedings of the 19th International Conference on World Wide Web, pp. 431-440. ACM.
  22. Hu, Xia; Tang, Lei; Tang, Jiliang; Liu, Huan (2013). “Exploiting Social Relationships for Sentiment Analysis in Microblogging”. In Proceedings of the 6th ACM International Conference on Web Search and Data Mining.
  23. Hu, Xia; Tang, Jiliang; Gao, Huiji; Liu, Huan (2013). “Unsupervised Sentiment Analysis with Emotional Signals”. In Proceedings of the 22nd International World Wide Web Conference.
  24. Hu, Xia; Tang, Jiliang; Zhang, Yanchao; Liu, Huan (2013). “Social Spammer Detection in Microblogging”. In Proceedings of the 23rd International Joint Conference on Artificial Intelligence.
  25. Hu, Xia; Tang, Jiliang; Liu, Huan (2014). “Online Social Spammer Detection”. In Proceedings of the 28th AAAI Conference on Artificial Intelligence.
  26. Hu, Xia; Tang, Jiliang; Liu, Huan (2014). “Leveraging Knowledge Across Media for Spammer Detection in Microblogging”. In Proceedings of the 37th Annual ACM SIGIR Conference.
  27. Hu, Xia; Tang, Jiliang; Gao, Huiji; Liu, Huan (2014). “Social Spammer Detection with Sentiment Information”. In Proceedings of the IEEE International Conference on Data Mining.
  28. Tang, Jiliang; Liu, Huan (2012). “Feature Selection with Linked Data in Social Media”. In Proceedings of SIAM International Conference on Data Mining.
  29. Tang, Jiliang; Liu, Huan (2014). “Feature Selection for Social Media Data”. ACM Transactions on Knowledge Discovery from Data (TKDD), 8 (4), pages 19:1-19:27.
  30. Tang, Jiliang; Liu, Huan (2012). “Unsupervised Feature Selection for Linked Social Media Data”. In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
  31. Tang, Jiliang; Liu, Huan (2014). “Unsupervised Feature Selection for Linked Social Media Data”. IEEE Transactions on Knowledge and Data Engineering (TKDE), 26 (12): 2914-1927.
  32. Tang, Jiliang; Liu, Huan (2014). “Trust in Social Computing”. In Proceedings of the 23rd International World Wide Web Conference.
  33. Tang, Jiliang; Gao, Huiji; Liu, Huan (2012). “mTrust: Discerning Multi-Faceted Trust in a Connected World”. In Proceedings of the 5th ACM International Conference on Web Search and Data Mining.
  34. Tang, Jiliang; Gao, Huiji; DasSarma, Atish; Liu, Huan (2012). “eTrust: Understanding Trust Evolution in an Online World”. In Proceedings of ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.
  35. Tang, Jiliang; Gao, Huiji; Hu, Xia; Liu, Huan (2013). “Exploiting Homophily Effect for Trust Prediction”. In Proceedings of the 6th ACM International Conference on Web Search and Data Mining.
  36. Tang, Jiliang; Hu, Xia; Liu, Huan (2014). “Is Distrust the Negation of Trust? The Value of Distrust in Social Media”. In Proceedings of ACM Hypertext conference.
  37. Tang, Jiliang; Hu, Xia; Chang, Yi; Liu, Huan (2014). “Predictability of Distrust with Interaction Data”. ACM International Conference on Information and Knowledge Management.
  38. Tang, Jiliang; Chang, Shiyu; Aggarwal, Charu; Liu, Huan (2015). “Negative Link Prediction in Social Media”. In Proceedings of ACM International Conference on Web Search and Data Mining.
  39. Bruno, Nicola (2011). “Tweet first, check later?” Oxford: Reuters Institute for the Study of Journalism, University of Oxford. 10: 2010-2011.
  40. Sakaki, Takeshi; Okazaki, Makoto; Matsuo, Yutaka (2010). “Earthquake shakes Twitter users: real-time event detection by social sensors”. Proceedings of the 19th International Conference on World Wide Web: 851-860.
  41. Mendoza, Marcelo; Poblete, Barbara; Castillo, Carlos (2010). “Twitter under crisis: Can we trust what we RT?” Proceedings of the first workshop on social media analytics: 71-79.
  42. Kumar, Shamanth; Barbier, Geoffrey; Abbasi, Mohammad Ali; Liu, Huan (2011). “TweetTracker: An Analysis Tool for Humanitarian and Disaster Relief”. The 5th International AAAI Conference on Weblogs and Social Media. Retrieved 1 December 2014.
  43. Kumar, Shamanth; Hu, Xia; Liu, Huan (2014). “A behavior analytics approach to identifying tweets from crisis regions”. Proceedings of the 25th ACM Conference on Hypertext and Social Media: 255-260.
  44. Gao, Huiji; Tang, Jiliang; Liu, Huan (2012). “Exploring Social-Historical Ties on Location-Based Social Networks”. In Proceedings of the Sixth International AAAI Conference on Weblogs and Social Media.
  45. Gao, Huiji; Tang, Jiliang; Liu, Huan (2012). “Mobile Location Prediction in Spatio-Temporal Context”. Nokia Mobile Data Challenge Workshop 2012.
  46. Gao, Huiji; Tang, Jiliang; Liu, Huan (2012). “gSCorr: Modeling Geo-Social Correlations for New Check-ins on Location-Based Social Networks”. In Proceedings of the 21st ACM International Conference on Information and Knowledge Management.
  47. Gao, Huiji; Tang, Jiliang; Hu, Xia; Liu, Huan (2013). “Exploring Temporal Effects for Location Recommendation on Location-Based Social Networks”. In Proceedings of the 7th ACM Recommender Systems Conference.
  48. Gao, Huiji; Tang, Jiliang; Hu, Xia; Liu, Huan (2014). “Content-Aware Point of Interest Recommendation on Location-Based Social Networks”. In Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence.
  49. Gao, Huiji; Tang, Jiliang; Liu, Huan (2014). “Personalized Location Recommendation on Location-Based Social Networks”. In Proceedings of the 8th ACM Recommender Systems Conference.
  50. Barbier, Geoffrey; Feng, Zhuo; Gundecha, Pritam; Liu, Huan (2013). “Provenance Data in Social Media”. Synthesis Lectures on Data Mining and Knowledge Discovery.
  51. Gundecha, Pritam; Feng, Zhuo; Liu, Huan (2013). “Seeking Provenance of Information in Social Media”. In Proceedings of the 22nd ACM International Conference on Information and Knowledge Management.
  52. Gundecha, Pritam; Barbier, Geoffrey; Tang, Jiliang; Liu, Huan (2014). “User Vulnerability and its Reduction on a Social Networking Site”. ACM Transactions on Knowledge Discovery from Data.