voice of humanity: Blogosphere Review and the MetaWeb Concept    
26 Oct 2003 @ 15:14, by Roger Eaton

This article is an overview of the blogosphere by a newcomer bringing a fresh eye to the scene -- the result of my research into how best to implement the voice of humanity vohP2P middleware program in a blog-friendly way. I am taking ming's comments about making blog metadata available through an interface seriously. The technoblogerati will want more than a metadata interface, though, before they are seriously interested, so it was delightful to stumble on an idea that really might get the voh project moving. I call it the MetaWeb; it is a simple, highly extensible idea for building a peer-to-peer web on top of the web, implementing, among other things, the instant blogroll by category and the stigmergic ant trail in a big way - see too this article.

A stroll through the faqs for those who are new to the weblog party is a revelation. (The first thing to know is to say blog, not beelog, and the second thing is that the party might just be getting started.) From 1999 Jorn Barger's Web Resources FAQ looks back at where weblogs came from. Originally, the notion was to log and annotate other websites of interest. People who had their own websites made log entries manually.

A wave of online blog tools, also called blog builders or blog servers, made it possible to maintain a weblog without being very technical or having a website. Here is a September 2003 review of some of the top dog blog builders. From a more up to date perspective thanks to Andreas Ramos we can see that once blogging facilities were available to us masses, we began to blog freestyle. Online diaries have become a popular form and there are weblogs where original articles predominate rather than commentary. "Newslogs" track many stories, from Iraq to fashion. Blogs are generally still link intensive, but what makes a blog a blog is that it is entry driven, with new entries not overlaying the old, just pushing them down the stack eventually to be archived.

In a second breakthrough, the online tools adopted the practice of creating standardized RSS newsfeeds in XML to accompany each weblog. Don't ask what RSS means. Here is what an RSS newsfeed looks like. XML is easy for machines to process, so once these feeds were available, programmers began creating widgets for them. One is the "news aggregator", which you set up on your home pc and use to subscribe to other people's newslogs. The aggregator automatically checks for new entries for you, displays them nicely in a summary list and maybe lets you click in to see the full article. If you have a list of favorite weblogs, this is the way to go. Another possibility is to display a newsfeed summary for someone else's weblog on your website. Here's one implementation at $15 a year. In addition, you can register your RSS newsfeed at numerous sites and have your weblog entries automatically scanned and stored by keyword for others to find. Examples are Syndic8 and NewsIsFree.
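To make the aggregator idea concrete, here is a minimal sketch in Python of the core step: reading a feed and listing its entries. The sample feed, the example.org addresses and the function name list_entries are made up for illustration, and a real aggregator would fetch the feed over http and cope with the several RSS flavors.

```python
import xml.etree.ElementTree as ET

# A tiny hand-written feed in RSS 2.0 style (hypothetical content).
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Weblog</title>
    <link>http://example.org/</link>
    <item>
      <title>First entry</title>
      <link>http://example.org/2003/10/first</link>
      <description>A short summary of the first entry.</description>
    </item>
    <item>
      <title>Second entry</title>
      <link>http://example.org/2003/10/second</link>
      <description>A short summary of the second entry.</description>
    </item>
  </channel>
</rss>"""


def list_entries(feed_xml):
    """Return one dict per <item> in the feed, in document order."""
    root = ET.fromstring(feed_xml)
    entries = []
    for item in root.iter("item"):
        entries.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "summary": item.findtext("description", default=""),
        })
    return entries


if __name__ == "__main__":
    for entry in list_entries(SAMPLE_FEED):
        print(entry["title"], "->", entry["link"])
```

An aggregator would run something like this on a schedule for each subscribed feed and show only the entries it has not seen before.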

Several of the main blog builders also established APIs (application programming interfaces) that connect straight into the blog database where the webloggers' entries are stored. With a client program running on your desktop you can conveniently write your entries while offline and then blast them all up when you go online, instead of going through the web interface, slowly, slowly. Plus, the same API can be used instead of the RSS newsfeed to aggregate entries.

One big problem with all this is that there are several flavors of RSS and several different APIs, and none of them does what everyone wants. Much to their credit, therefore, the technical people in the community have gotten together to design a next-generation API that is certainly going to be adopted by the blog builders. In fairly short order the "atom" API was thoroughly talked over by the interested parties and consensus was achieved on the new, improved and extensible design. Interestingly, although one of the RSS flavors was built on RDF (see an earlier article), the atom design team decided RDF was not appropriate for the new API, presumably because it complicates things without any near-term payback. Characteristically, the rejection of RDF was combined with a request to the RDF supporters to please come forward with any design factors that might be useful in adding an RDF extension later. When an either/or design decision was forced, the atom team remained inclusive and conciliatory in spirit. And as it turns out, the atom-->xslt-->rdf formula will likely fill the bill for those who want to work in rdf format.

Something to watch for as atom comes into play is a period of rapid differentiation among the blog builders based on atom's extensibility. If I am not off the mark, weblog clients will soon be available that store the users' log entries on their home machines. This will make it possible to switch from one blog builder/server to another and carry all one's archives and active logs across to the new server. Therefore, the blog servers will add special features in an attempt to keep their users from leaving and to lure bloggers over from other servers. My guess is that community-building facilities will be important in these new extensions.

In a major move, google has recently acquired the largest blog server, Blogger. This acquisition occurred after Blogger committed to atom, and so far google has not changed course, nor will it, I expect. The weblogger community has no animus against google as far as I can tell and google, 900-pound gorilla physique notwithstanding, doesn't need online antagonists, particularly not of the hacker variety. However, there is a tension built into the situation. As the blog servers differentiate, what if someone comes up with a peer-to-peer design based on atom that brings the server down much closer to the users, so that each blog server would be handling say 1 to 100 clients? Now what if this new peer-to-peer server provides an extension based on its peer-to-peer connection capability that google cannot match? This would be a classic top-down vs bottom-up duel, and it is not clear who would win, considering that people will be able to vote with their feet by moving their blog from one system to another. Yes, very likely google, with all its money and other advantages of centralization, such as a "blog this" button on the google toolbar, but not necessarily so. There is a lot of sentiment for bottom-up, and that is what google just cannot be, not really, though I expect they will try to mimic the effect.

Another blog development of some consequence is called "trackback" with its offshoot, "Topic Exchange". Trackback allows a weblogger to post a comment on her weblog to another trackback enabled weblog entry and have an introductory snippet (up to 256 characters) automatically displayed under the provoking entry. It is even possible to reply to a reply and so to have a public conversation that can be tracked by a third party tool. Although it is not really very hard for the user, still it can be a bit confusing at first and the snippet that displays is too short to say anything, so trackback has not caught on in a really big way. Not as far as I can tell anyway.
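For the technically curious, a trackback ping is, as far as I understand the published spec, just an http POST of a few form fields (url, title, excerpt, blog_name) to the ping address of the entry being commented on, answered by a small XML response with an error code. The Python sketch below builds such a ping body and reads the reply; the 256-character limit is the figure quoted above, and the function names and example.org addresses are made up for illustration.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Snippet limit as quoted in the article; actual servers may truncate differently.
MAX_EXCERPT = 256


def build_ping_body(url, title, excerpt, blog_name):
    """Return the form-encoded body for a trackback ping."""
    if len(excerpt) > MAX_EXCERPT:
        # Leave room for the ellipsis so the result is exactly MAX_EXCERPT long.
        excerpt = excerpt[:MAX_EXCERPT - 3] + "..."
    return urlencode({
        "url": url,
        "title": title,
        "excerpt": excerpt,
        "blog_name": blog_name,
    })


def ping_succeeded(response_xml):
    """True when the response carries <error>0</error>."""
    root = ET.fromstring(response_xml)
    return root.findtext("error", default="1").strip() == "0"


if __name__ == "__main__":
    body = build_ping_body(
        "http://example.org/2003/10/my-reply",
        "A reply",
        "Some thoughts on the original entry.",
        "Example Weblog",
    )
    print(body)
    print(ping_succeeded("<response><error>0</error></response>"))
```

Actually sending the ping would mean POSTing the body to the target entry's ping address with a form-encoded content type; that step is left out here so the sketch runs without a network.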

Trackback was designed to enable topical community weblogs sponsored by the blog server. Then individual bloggers that are trackback enabled can post topical comments to both their own weblog and the community weblog for that topic in one go. Replies can be posted three ways, to one's own weblog, to the weblog of the entry being replied to and to the common community weblog for the topic. Good idea, and it is happening to a degree, though again, it has not caught on fully, not yet anyway. Somehow, before long, the desire to aggregate the blogger community by topic and implement cross blog posting will find the technical answer that works.

Finally, in this quick catalog of blogosphere elements, the blogrolling phenom needs to be mentioned. Blogrolling is the practice of linking from a sidebar in one's own weblog to other weblogs that one thinks are particularly worth watching. Then several websites, blogrolling.com for example, which btw also offers a free blogroll widget for instant blogrolling, use the blogroll cross references to list the top 100 blogs – i.e. those that are linked to most often. Taking the blogroll a step further is the daypop search engine with such novel features as wordburst tracking and an improved top 100 list of blogs.
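The top 100 idea comes down to counting inbound blogroll links. Here is a minimal Python sketch of that counting, assuming the blogrolls have already been collected into a dictionary; the function name top_blogs and the tie-breaking by name are my own choices, not how blogrolling.com or daypop actually do it.

```python
from collections import Counter


def top_blogs(blogrolls, n=100):
    """Rank blogs by how many other blogs list them in a blogroll.

    blogrolls maps each blog to the list of blogs it links to.
    Self-links and duplicate links from the same blog count once at most.
    """
    counts = Counter()
    for owner, links in blogrolls.items():
        for target in set(links):
            if target != owner:
                counts[target] += 1
    # Highest count first; ties broken alphabetically for a stable listing.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return ranked[:n]


if __name__ == "__main__":
    sample = {
        "alpha": ["beta", "gamma"],
        "beta": ["gamma"],
        "gamma": ["alpha", "beta"],
    }
    for blog, inbound in top_blogs(sample, n=3):
        print(blog, inbound)
```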

A problem with these top 100 lists is that they do not cover the hundreds of smaller groupings of bloggers who cross-post each other on particular topics. However, Feedster lets you search an extensive blog database and list the results by blogroll rank, which helps sort out groupings of weblogs. (If you search by the phrase "collective intelligence", you come up with a Winer! And that points to How to Decentralize Directories, which is somewhat like the voice of humanity structure that was described in a previous article.)

So where does vohP2P, the voice of humanity middleware implementation, fit into this picture?

First of all, vohP2P must make a perfect back end for the perfect atom client/server. Therefore, it must be able to store and retrieve blog entries fast and flexibly.

Additionally, vohP2P should be able to:

2) get-from-web-atom-server, store and feed-to-web-atom-server all entries, including archives, for any given blog, using whatever specialized code is necessary, blogbuilder by blogbuilder. This is to implement the easy transfer of a blog from one blog server to another.

3) form a peer-to-peer group with other vohP2P instances.

4) use BitTorrent to flood a peer-to-peer group with large new files.

5) allow a specialized peer group gateway to pass off requests for your weblog to your atom server if you are online and to another online peer group server if you are offline. (Is this possible, sensible?)

6) store weblog owner information such as name, email address, age, sex and so forth.

7) keep track of a server owner controlled directory hierarchy - including adding new folders, moving folders, deleting folders, tracking folder moderators and so forth. This will implement flexible categories for blogs through an extension to the atom API.

8) allow a folder moderator to link that folder to a folder in another vohP2P server as described in a previous article, with upload/download rules.

9) provide an entry to the MetaWeb. The simple but oh so powerful idea here -- has no one else thought of it? -- is to have a cgi server that mediates between the ultimate user and the web. The MetaWeb will act like a website that captures a user and links out to the web for the user in a frame. But we don't want to use frames, of course. Instead we will wrap the page from the web with a header and a footer that we provide. A simple rule for using a url as a cgi parameter will invoke vohP2P in its MetaWeb server capacity.

The first link into the MetaWeb will likely be to the vohP2P owner's own weblog. vohP2P will then fetch the requested weblog page over the internet, wrap the page with a footer and header and change all the page links so they go back through the same vohP2P program instead of directly to the requested pages. In the header will be a MetaWeb address bar so the user can remain in the MetaWeb instead of using the browser's address bar. In the footer, we add navigation to the vohP2P directory for the current weblog owner, and to other vohP2P instances as designated by the owner. A javascript program lets us rate the page we are on for interest by pressing any numeric key 1 to 9, or to vote y or n on the page for yes/no, thumbs up/thumbs down, and the server stores this information. (Javascript can do this, right?) A 'b' blogrolls the page by the current category, also adding the page to the current category item list.

Of course we will want to mark the link trail stigmergically, and that means either a color scheme for links or possibly the adding of small gifs next to each link to indicate the average rating received by the page linked to. A great feature will be the ability to link into the MetaWeb from different communities, using rating averages from the current community to color code the links. Better yet, we will be able to switch communities without even leaving the MetaWeb by hopping to a different vohP2P server.

10) having the MetaWeb, with its ratings by known webloggers who have provided user information, we can apply the intelligent search capabilities of a previous article and build up the blog meta-data which we will share with other systems.
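To show what the MetaWeb wrapping in item 9 might look like, here is a rough Python sketch: it rewrites every link on a fetched page so it points back through the MetaWeb cgi, then adds a header and footer inside the page body. The cgi address, the function names and the regular-expression approach are all assumptions for illustration; a real version would want a proper HTML parser and would also have to deal with forms, images and scripts.

```python
import re
from urllib.parse import quote, urljoin

# Hypothetical address of the MetaWeb cgi; the target url is passed as a parameter.
METAWEB_CGI = "http://localhost/cgi-bin/metaweb?url="

# Matches the href attribute of an <a> tag, quoted with either ' or ".
HREF_RE = re.compile(r'(<a\b[^>]*?\bhref\s*=\s*)(["\'])(.*?)\2',
                     re.IGNORECASE | re.DOTALL)
BODY_OPEN_RE = re.compile(r"<body\b[^>]*>", re.IGNORECASE)
BODY_CLOSE_RE = re.compile(r"</body\s*>", re.IGNORECASE)


def rewrite_links(page_html, page_url, cgi_prefix=METAWEB_CGI):
    """Point every <a href> back through the MetaWeb cgi."""
    def replace(match):
        # Resolve relative links against the page's own address first.
        absolute = urljoin(page_url, match.group(3))
        wrapped = cgi_prefix + quote(absolute, safe="")
        return match.group(1) + match.group(2) + wrapped + match.group(2)
    return HREF_RE.sub(replace, page_html)


def wrap_page(page_html, page_url, header_html, footer_html):
    """Rewrite the links, then add the MetaWeb header and footer."""
    html = rewrite_links(page_html, page_url)
    html = BODY_OPEN_RE.sub(lambda m: m.group(0) + header_html, html, count=1)
    html = BODY_CLOSE_RE.sub(lambda m: footer_html + m.group(0), html, count=1)
    return html


if __name__ == "__main__":
    page = '<html><body><a href="/about.html">About</a></body></html>'
    print(wrap_page(page, "http://blog.example/index.html",
                    "<div>MetaWeb header</div>", "<div>MetaWeb footer</div>"))
```

The header is where the MetaWeb address bar would go, and the footer is where the directory navigation and rating controls would go.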

That has to be it for this article. There are loose ends here, to be sure. Next up is a series of articles about the bloggers who are in the conceptual neighborhood.




27 Oct 2003 @ 16:41 by ming : MetaWeb
I'm trying to wrap my mind around the core of this.

So, you're kind of hinting that there's a standardized way of trafficking weblog types of data and metadata about it. And if there isn't a standardized way (like a universal agreement on an RSS or Atom interface), a middleware software can be the expert at pulling and pushing any common formats that are out there, and converting them to each other, and adding metadata to the mix.

So, there's the possibility that the weblogs could actually be hosted by a cloud, like a bittorrent p2p kind of thing, where you don't really care much where the data actually is stored. And that might well make sense if and when the extra metadata layers start becoming indispensable for everybody. But there isn't necessarily a compelling reason for it today based on traffic alone. Even the most popular weblogs today can probably be hosted fine in a regular hosting account, without worrying too much about the cost.

I should mention Userland's Radio, which is one of the top weblog programs. The data is stored on one's own disk, but then mirrored on to the server when one posts. It gives the user the feeling that they control the data and they have it right here in front of them. Which might be an important feature, just feeling-wise. I want to control my own data, at least the basic raw version of it. Not the meta-data that others add. So, are you thinking, maybe, that we might skip the traditional hosting link in the chain? You store the original data in your own machine, and then post it into a p2p cloud, rather than to a particular server?

But, yes, I want to see the web through my own MetaWebNavigator, which knows something about my preferences and previous history, and which uses the preferences and ratings and history of my friends and favorite groups.

If the metadata (ratings, comments, links, etc) becomes indispensable enough, it could very well become inconceivable to want to surf the web without it.

Like, if I was shopping in the supermarket and I had a {link:http://ming.tv/flemming2.php/_v10/__show_article/_a000010-000412.htm|device} that allowed me to instantly and easily see all possible useful references to a given product - what previous buyers thought about it, what consumer analysis has shown, the environmental record of the company, and who owns it - I would never go back to just looking at the packaging again. It would be a no-brainer that I'd want full disclosure, even if it took me 10 seconds longer to evaluate each product with it than without it. So, I'd say it is the same way with anything on the web. If I could easily and quickly get a good picture of what I'm looking at, where it fits in my picture of the world, what quality it has been found to have, how useful it has been to others, how many others have looked at it, etc. - of course that's what I would choose almost all the time.  

27 Oct 2003 @ 16:44 by ming : JavaScript
Actually I don't think there's an elegant way of letting JavaScript take keystrokes, unless it involves some kind of form field. Unless there's something I'm missing. Flash or Java can, of course.  

28 Oct 2003 @ 16:49 by ming : Atom
Now, the problem with an engine that can import and export to all sorts of formats is of course that the different feed formats (RSS) have different ideas of what the model of a blog posting is. Atom seems indeed to be the best attempt at including a complete model. But then most likely all the RSS formats will be missing something. Of course it is still possible to convert any feed format to any other format, if we're ok with leaving some things out.

Anyway, it seems to me like a good place to start would be to come up with a standardized way of adding meta data to the stuff you can pick up from feeds. There are already a number of sites, like Technorati, newsisfree, syndic8, that are good at picking up all sorts of things, and which add their own metadata to it. But they don't really provide any standard way of picking up the extra stuff as far as I know.

28 Oct 2003 @ 18:13 by mre : reply to ming
Thanks for taking the trouble to read the article through carefully, ming.

Rethinking, I see there is a problem with moving one's blog from one site to another -- the permalinks are lost, or else the entries are duplicated on the new site. Also any template/formatting from the previous site will be lost. I still favor having one vohP2P module that can extract the current entries of a weblog from its home site and another to push those entries to any {link:http://www.intertwingly.net/wiki/pie/FrontPage|atom} enabled site. If google, aol or yahoo decide not to go with atom, then we will need specialized code for each of them -- on the extraction end only! It will help build voh critical mass if it is easy for people to move their blogs into a voh enabled site, and it is only fair that we provide the capability to move the blogs the other way, too, out of the voh cloud to any site that supports the community developed and approved atom standard. But there is no need to support specialized API's for posting.

On the bittorrent point, we can leave it for later. There is also the {link:http://open-content.net/|Open Content Network}, which looks great -- see the knowledgeable and (almost) readable paper on the {link:http://open-content.net/specs/draft-jchapweske-caw-03.html|Content Addressable Web} by Justin Chapweske.

On skipping the hosting link in the weblog chain, and posting to a p2p cloud, that is the idea, but I am a little hazy on just how it will work. Of course it is no problem if you have a static web address and are always online, but for the rest of us, how do you have a permalink to a p2p cloud? A permalink has to be accessible to anyone that just has a browser and no special plugin or anything. Seems like we have to come back to having a hosting link, though the host now would just be a connector to the cloud and would not actually have the data itself.

Your {link:http://ming.tv/flemming2.php/_v10/__show_article/_a000010-000412.htm|portable product truth device} could well be down this road, just around the corner!

I'm going to take a month or so to post articles about what others are doing that might be related to the voh goals, and see if I can get some more feedback. Then I think I'll code the MetaWeb as a way of getting started. That should be relatively easy yet quite exciting and perhaps in the process we can find someone to take on the atom modules. The MetaWeb will give us an easy way for people to add meta data - ratings and keywords to begin with - to blog entries, and for that matter to other websites. We should try to keep the original effort on track, tho, targeting weblogs not the entire web. When we make metadata available, it will be as xml over http. The exact format is not clear to me yet; it will come into focus as the MetaWeb effort gets real. If you have specific ideas, please let me know.

Your comments always help.

(btw, I am reading {link:http://www.hrc.wmin.ac.uk/theory-collectiveintelligence.html|Pierre Levy's Collective Intelligence} and find it slow but likeable.)  

28 Oct 2003 @ 18:27 by ming : MetaWeb
Now, if we assumed that weblogs remained hosted whichever way they are, and the issue just is how to store and exchange metadata about them... And we assumed that each posting had a permalink which really was permanent, then it might simply be an XML structure which includes the permalink as one of the fields. And other fields would be rating, keywords, and who picked them, and when. And that XML data could be stored any which way, as long as it is discoverable, just like RSS. And there's a slightly different format for a summary record that already has added some things up.
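As a rough illustration of such a record, here is a Python sketch that builds one with the fields listed above: the permalink, the rating, the keywords, who picked them, and when. The element names (entry-metadata, rater and so on) are invented for the example; no agreed format exists yet.

```python
import xml.etree.ElementTree as ET


def build_metadata_record(permalink, rater, rated_on, rating, keywords):
    """Return one metadata record as an XML string.

    All element names here are hypothetical placeholders.
    """
    record = ET.Element("entry-metadata")
    ET.SubElement(record, "permalink").text = permalink
    ET.SubElement(record, "rater").text = rater
    ET.SubElement(record, "date").text = rated_on
    ET.SubElement(record, "rating").text = str(rating)
    keyword_list = ET.SubElement(record, "keywords")
    for word in keywords:
        ET.SubElement(keyword_list, "keyword").text = word
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(build_metadata_record(
        "http://example.org/2003/10/first",
        "some-reader",
        "2003-10-28",
        7,
        ["weblogs", "metadata"],
    ))
```

Because the permalink is just another field, records like this could live anywhere that is discoverable, in the same way RSS feeds do.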

An assortment of programs might produce those xml structures. The functionality might be built into weblog programs, or into aggregators, or into browsers, or all of the above. And big programs and sites will go find all the data and aggregate it. And smaller programs on one's desktop might spider around for the data that is of interest for a particular individual.

What do you have in mind working in? Python?  

29 Oct 2003 @ 08:12 by mre : re: MetaWeb
That sounds good, but I just don't know enough at this point to say how to proceed. The metadata structure needs to accommodate hierarchy as well as ratings and keywords. Creating that hierarchy from the bottom up is fundamental to the voh project, so it makes sense from the voh point of view. My thought, not fully developed at this point, was that MetaWeb entrances would correspond to particular leaf node categories in the voh hierarchy, and that tagged weblog entries would be imported into the current category. Ratings and keywords would be in the context of the current category. The same weblog entry might be tagged in more than one category and rated/keyworded quite differently in the different contexts.

I am planning to use Python and MySQL. The database access will be modularized so other databases can be used instead.  

29 Oct 2003 @ 08:39 by ming : Proceeding
Hm, yeah, the hierarchy is the hard part. But what if that problem were separated out a bit? So, there'd be the XML structure referring to a blog permalink, plus fields for ratings and keywords, and what if, for categorization into a hierarchy, it would refer to a black box categorization engine elsewhere? I.e. the basic XML structure doesn't have to solve the problem of how to coordinate and aggregate all those categories into useful hierarchies, if it is just able to contain several categories and/or link to multiple categories elsewhere.

So, if there is just a way of connecting up with a categorization engine and saying, this post fits right there and there. Then the problem of how to aggregate those can be addressed separately.

So, as I said before, I think the most viable strategy would be to divide this into several different black boxes, with a clear interface between them, and be open to the possibility that somebody else might come along and do one of those boxes better than you can. There's the metadata XML layout, there's a program to fill it in, there's an engine that safeguards certain categories or hierarchies, there's an aggregator that spiders around and gathers the data, and there's maybe a separate aggregator for categories, that figures out how categories from different places fit into other categories or how they're equivalent. And potentially there might be several of each of these, done by different people, so one has a choice, as long as there are some basic agreements they all share. Which, I think, might just be the metadata XML, Atom and/or RSS, and probably, hopefully, your clever concept of how categories scale and aggregate.

29 Oct 2003 @ 09:03 by mre : oops -- crossed wires
Your reply to my previous was clearly written and posted as I re-edited and reposted what I had said! The black box easy way out for hierarchies doesn't work when you realize that ratings and keywords must be in the context of the current category. At least I don't see how it can work.  

29 Oct 2003 @ 10:20 by mre : rethinking
On further consideration, it should be ok to separate the categories from the other metadata. When we know the category(s) of an item, then those categories can be included as keywords in the metadata section, perhaps even as "category-keywords" to distinguish them from regular keywords. But if a category has not been assigned, then the ratings and keywords are still valid metadata.

I was wrong to think that a rating by a particular person on a particular weblog entry is in context of a category. That rating is for the entry regardless of category. Different people might frequent different categories, so the same entry might indeed get different marks depending on category, but a particular person would not rate the entry multiple times, once for each category.

So great, we are on the same wavelength! I had myself worried for a minute.

29 Oct 2003 @ 13:45 by ming : Ratings in context
You do have a point there, however. It could very well be it makes sense to give a rating in relation to a certain category. Which will simply be the rating number and a pointer to the category it is in relation to. It probably doesn't make sense for all kinds of ratings, but it certainly would make sense for a Relevance rating.

Which of course brings up... What kind of ratings DO you plan on? Quality, agreement, relevance, entertainment value? Or should that be something extensible?  

30 Oct 2003 @ 07:17 by mre : continuing...
It's hard to keep the whole picture in mind at once. Relevance ratings are a special case where the item/category are being rated as a pair. It is an irresistible idea. vohInterMix can handle it and will, but how to handle it in general, I don't see.

The voh concept requires that "interest" ratings (1-9, say) and a Yes/No vote be built into the system as universal core ratings. Interest ratings should always reflect the personal interest of the rater, and we will use those ratings to feed material back automatically to the rater and for all the other auto-categorization methods outlined previously. The Yes / No vote reflects the voter's approval or disapproval where social issues are involved, or agreement / disagreement where more intellectual issues are involved.

As I see it, the keywords flexibly take the place of all other ratings except relevance. An entertaining item will have the keyword "entertaining" more often applied. The "interest" rating applies fairly universally, while the other possible ratings are more specialized, and I would rather stay away from the complications they bring. Relevance also applies universally, but only in the category context.  
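A minimal Python sketch of the two core ratings might look like the following: an interest rating from 1 to 9 and an optional yes/no vote per rater and entry, plus a summary across raters. The class and field names are placeholders for illustration, not part of any existing voh code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Rating:
    """One person's core ratings of one weblog entry (names are hypothetical)."""
    rater: str
    permalink: str
    interest: Optional[int] = None  # personal interest, 1 to 9
    vote: Optional[bool] = None     # True = yes, False = no

    def __post_init__(self):
        if self.interest is not None and not 1 <= self.interest <= 9:
            raise ValueError("interest must be between 1 and 9")


def summarize(ratings):
    """Average the interest ratings and tally the yes/no votes."""
    interests = [r.interest for r in ratings if r.interest is not None]
    votes = [r.vote for r in ratings if r.vote is not None]
    return {
        "average_interest": sum(interests) / len(interests) if interests else None,
        "yes": sum(1 for v in votes if v),
        "no": sum(1 for v in votes if not v),
    }


if __name__ == "__main__":
    sample = [
        Rating("ann", "http://example.org/p/1", interest=8, vote=True),
        Rating("bob", "http://example.org/p/1", interest=4),
        Rating("cy", "http://example.org/p/1", vote=False),
    ]
    print(summarize(sample))
```

Keywords and relevance are left out on purpose; relevance would need a pointer to a category as well, since it rates the item and category as a pair.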

30 Oct 2003 @ 07:41 by ming : Rating by Category
OK, so there's the possibility that all sorts of ratings might be done in the shape of referring to a category defined elsewhere. Someone might define five levels of entertainment value, and simply by linking to which one I think a certain item belongs in, I'm both rating and categorizing it. And you don't have to think about all the possible ways of rating something up front if you do that.

13 Feb 2004 @ 19:03 by Roger Eaton : ratings for reviews are different
Most items will be rated for what they are: articles, images, audio and visual clips. Items in lists of movies or books and the like will be rated for the item they point to, i.e. for the movie or book. Reviews also point to an item, but they are rated EITHER for themselves OR for their usefulness as a pointer.

Both distinctions are important enough that they need to be addressed in the rating system. That is, 1) the distinction between items rated for themselves and items rated for the thing pointed to, and 2) the distinction between a review as an item in itself and a review as a pointer.

So there are three kinds of items:

1) simple items that are rated for themselves
2) simple pointers that are rated for the item pointed to
3) reviews, which can be rated for
   a) themselves (a clever and informed review will get high marks)
   b) usefulness as a review
   c) the value of the item being reviewed
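One way to pin the three kinds down is a small lookup of which rating targets each kind of item allows. The Python sketch below is only an illustration of that idea; the enum and function names are made up.

```python
from enum import Enum


class ItemKind(Enum):
    SIMPLE = "simple item"
    POINTER = "simple pointer"
    REVIEW = "review"


class RatingTarget(Enum):
    ITEM_ITSELF = "the item itself"
    USEFULNESS = "usefulness as a review"
    POINTED_TO = "the item pointed to"


# Which targets may be rated for each kind of item, following the list above.
ALLOWED_TARGETS = {
    ItemKind.SIMPLE: {RatingTarget.ITEM_ITSELF},
    ItemKind.POINTER: {RatingTarget.POINTED_TO},
    ItemKind.REVIEW: {
        RatingTarget.ITEM_ITSELF,
        RatingTarget.USEFULNESS,
        RatingTarget.POINTED_TO,
    },
}


def can_rate(kind, target):
    """True when an item of this kind can be rated for this target."""
    return target in ALLOWED_TARGETS[kind]


if __name__ == "__main__":
    print(can_rate(ItemKind.REVIEW, RatingTarget.USEFULNESS))
    print(can_rate(ItemKind.POINTER, RatingTarget.ITEM_ITSELF))
```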

