Tim Finin, 1:20pm 30 December 2008
Akshay Java defended his PhD dissertation this fall on discovering communities in social media systems and the submitted version is now available online. Akshay is now a scientist at Microsoft’s Live Labs. The citation, link and abstract are below.
Akshay Java, Mining Social Media Communities and Content, Ph.D. Dissertation, Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, December 1, 2008. Available at http://ebiquity.umbc.edu/paper/html/id/429/Mining-Social-Media-Communities-and-Content.
Social Media is changing the way people find information, share knowledge and communicate with each other. The important factor contributing to the growth of these technologies is the ability to easily produce “user-generated content”. Blogs, Twitter, Wikipedia, Flickr and YouTube are just a few examples of Web 2.0 tools that are drastically changing the Internet landscape today. These platforms allow users to produce and annotate content and, more importantly, empower them to share information with their social network. Friends can, in turn, comment and interact with the producer of the original content and also with each other. Such social interactions foster communities in online social media systems. User-generated content and the social graph are thus the two essential elements of any social media system.
Given the vast amount of user-generated content being produced each day and the easy access to the social graph, how can we analyze the structure and content of social media data to understand the nature of online communication and collaboration in social applications? This thesis presents a systematic study of the social media landscape through the combined analysis of its special properties, structure and content.
First, we have developed a framework for analyzing social media content effectively. The BlogVox opinion retrieval system is a large-scale blog indexing and content analysis engine. For a given query term, the system retrieves and ranks blog posts expressing sentiments (either positive or negative) toward the query term. Further, we have developed a framework to index and semantically analyze syndicated feeds from news websites. We use a sophisticated natural language processing system, OntoSem, to semantically analyze news stories and build a rich fact repository of knowledge extracted from real-time feeds. It enables other applications to benefit from such deep semantic analysis by exporting the text meaning representations in the Semantic Web language OWL.
Secondly, we describe novel algorithms that utilize the special structure and properties of social graphs to detect communities in social media. Communities are an essential element of social media systems and detecting their structure and membership is critical in several real-world applications. Many algorithms for community detection are computationally expensive and generally do not scale well for large networks. In this work we present an approach that benefits from the scale-free distribution of node degrees to extract communities efficiently. Social media sites frequently allow users to provide additional meta-data about the shared resources, usually in the form of tags or folksonomies. We have developed a new community detection algorithm that can combine information from tags and the structural information obtained from the graphs to effectively detect communities. We demonstrate how structure and content analysis in social media can benefit from the availability of rich meta-data and special properties.
Finally, we study social media systems from the user perspective. In the first study we present an analysis of how a large population of users subscribes to and organizes the blog feeds that they read. This study has revealed interesting properties and characteristics of the way we consume information. We are the first to present an approach to what is now known as the “feed distillation” task, which involves finding relevant feeds for a given query term. Based on our understanding of feed subscription patterns we have built a prototype system that provides recommendations for new feeds to subscribe to and measures the readership-based influence of blogs in different topics.
We are also the first to measure the usage and nature of communities in a relatively new phenomenon called microblogging. Microblogging is a new form of communication in which users can describe their current status in short posts distributed by instant messages, mobile phones, email or the Web. In this study, we present our observations of the microblogging phenomenon and user intentions by studying the content, topological and geographical properties of such communities. We find that microblogging provides users with a more immediate form of communication to talk about their daily activities and to seek or share information.
The course of this research has highlighted several challenges that processing social media data presents. This class of problems requires us to re-think our approach to text mining, community and graph analysis. Comprehensive understanding of social media systems allows us to validate theories from social sciences and psychology, but on a scale much larger than ever imagined. Ultimately this leads to a better understanding of how we communicate and interact with each other today and in the future.
Robert Scoble is said to be in the know when it comes to Social Media. He really knows the pulse of the Social Media industry in Silicon Valley and beyond. In this video, Scobleizer shows you how to be a bigger FriendFeed Monster than Guy Kawasaki.
This is what Scoble covers in the video:
1. Why friendfeed?
2. Get inbound content with the aggregator.
3. Get inbound content via friends.
4. How friend-of-a-friend feature brings more inbound content.
5. Using the everyone tab to get more inbound content.
6. Using rooms to find inbound content.
7. Using “best of” feature to find more inbound content.
8. Using the “me” and “home” pages.
9. Using lists to do friend management.
10. Creating media in friendfeed.
11. Sharing media found on the web.
12. Creating media with email.
13. Deciding between Twitter and friendfeed.
14. Your outbound content, likes.
15. Your outbound content, comments.
16. Your outbound content, send to Twitter.
17. Your outbound content, your stuff.
18. Your outbound content, using rooms.
19. Using search.
20. Using real-time features.
Check out what else Robert had to say here:
20 ways to being a bigger friendfeed monster than Guy Kawasaki
Well, the Social Media Landscape is ever changing, just like the skyline in Las Vegas. It seems like there’s a change every year there. Something gets blown up or something goes up. The same is true of the Social Media Landscape.
At the beginning of December, Pownce announced they will be closing their doors. While I have an account on Pownce and the features are really pretty cool, 99% of all of my updates were done with Twhirl. Basically I used Twhirl to post Tweets on Twitter, and it automatically posted them to Pownce too.
If you have a Pownce account, you can visit pownce.com/settings/export/ to generate an export file. You can then import your posts to other blogging services such as Vox, TypePad, or WordPress. However, you should do this quickly; only a couple of days are left.
Pownce put up a blog post about exporting; you can check it out here: blog.pownce.com
Now, on the other hand, Tumblr just closed its Series B round, bringing in $4.5 million to leverage the outstanding first year results. Tumblr now has an average of over 15 million unique visitors per month! Those are great results considering the social networking site started in 2007.
I’m sure the 22-year-old founder, David Karp, is ecstatic with the latest happenings of the company. Mr. Karp stated, “I feel so fortunate to have investors who believe in us, and a growing worldwide community that is just as excited about this neat thing we’re building…”
Tumblr has plans for their first quarter in 2009. They are planning on launching a premium service which will represent the company’s first revenue generation. It will be followed by a revenue strategy that leverages both Tumblr’s publishing platform and community, while continuing the company’s focus on innovation and invention.
I can’t wait to see what becomes of Tumblr over the next year.
As for Pownce, rest in peace. Such are the ways of the Social Media Landscape.
What is Social Media? That is a question that is asked quite often.
It’s hard to define sometimes, but this slide show does a great job of tackling the answer.
This was created by Lee White from Durham, NC, who has a consulting business specializing in implementing Social Tools in the enterprise.