Searchable Bookmarks, Social Bookmarking, and the Web

With search engines becoming less and less useful, personal and social bookmarking may become one of the new ways to find information on the net. I had some ideas about this before social bookmarking started to take off (probably the same ideas as other people), but sat on them for too long. Meanwhile Yahoo has created almost exactly what I envisioned. del.icio.us was something like what I was thinking but not quite. The basic idea I'm going to discuss here is being able to search the content of your bookmarks, and using a social network to share bookmarks to form a social search engine powered by people's bookmarks.

Update: It looks like Google has something close to searchable bookmarks (the Search History feature), but with no social features and no proper integration into their personal home page. They could do a lot to make this feature more useful, and the interface could use some work. If there is an easy way to add and remove bookmarks from an external application (maybe they have an API for it), this could be really useful and would do part of what I'm talking about.

This is not the first time I've had an idea and then discovered that someone else has implemented it. I guess my ideas are pretty obvious extensions of existing technology. There is some truth to the notion that ideas are worthless and implementations are what is valuable. This time I had actually started on the implementation, and it's getting to a point where it's usable. Just another lesson in getting started early.

Anyway, for me, it started with bookmarks. First of all, I use many different computers throughout my day. I go to work and use my workstation. Then I come home and use my Linux box, a laptop, or sometimes my Mac Mini for browsing. I have ended up with a different set of bookmarks on each machine. I use Firefox in most cases (sometimes Safari on the Mac Mini), so I can copy the bookmarks.html files around to have access to them, but that's just not very convenient. Although I completely missed the simplest solution (using version control like CVS or SVN), I started thinking about bookmark synchronization.

I was not able to find a decent bookmark synchronization utility for Firefox that did what I wanted. So I started working on a bookmark synchronization program that read in Firefox's bookmarks.html. But I wanted to take it a step further than a simple utility for one person. I wanted to make it a multi-user web site that held your bookmarks and allowed you to synchronize from that site. Seeing the success of Friendster and the whole social website phenomenon, I wanted people to be able to share their bookmarks with each other and add "friends." There would be a privacy setting for each bookmark: it could be completely private, only viewable by friends, or viewable by everyone.
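The three privacy levels could be modeled roughly like this. It is only a sketch of the idea, not the code from my application: the names `Visibility`, `Bookmark`, `can_view`, and the `friends_of` mapping are all made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Visibility(Enum):
    PRIVATE = "private"   # only the owner can see it
    FRIENDS = "friends"   # the owner and the owner's friends
    PUBLIC = "public"     # everyone


@dataclass(frozen=True)
class Bookmark:
    owner: str
    url: str
    title: str
    visibility: Visibility


def can_view(bookmark, viewer, friends_of):
    """Return True if `viewer` may see `bookmark`.

    `friends_of` is a hypothetical mapping of user name -> set of friends.
    """
    if viewer == bookmark.owner:
        return True
    if bookmark.visibility is Visibility.PUBLIC:
        return True
    if bookmark.visibility is Visibility.FRIENDS:
        return viewer in friends_of.get(bookmark.owner, set())
    return False  # PRIVATE and the viewer is not the owner


# Made-up sample data: alice has added bob as a friend.
friends_of = {"alice": {"bob"}}
bookmarks = [
    Bookmark("alice", "http://example.com/a", "Diary", Visibility.PRIVATE),
    Bookmark("alice", "http://example.com/b", "Recipes", Visibility.FRIENDS),
    Bookmark("alice", "http://example.com/c", "News", Visibility.PUBLIC),
]


def visible_titles(viewer):
    """Titles of the sample bookmarks that `viewer` is allowed to see."""
    return [b.title for b in bookmarks if can_view(b, viewer, friends_of)]
```

With this sample data, alice sees all three bookmarks, bob sees "Recipes" and "News", and a stranger sees only "News".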

But I have hundreds of bookmarks. I bookmark anything that is remotely useful. A lot of times I want to go back to a page I remember bookmarking. The titles of the pages alone are not enough to be able to find a page. And sometimes I bookmark something without paying attention to the title, and it simply has nothing to do with the page's content. It's not all that easy to find a page in my bookmarks a lot of the time. So what would be the best way to find a page? If I could search the content of all those bookmarks that would be the ideal way. So basically I would want a search engine that only gave me the results from a specific set of pages (my bookmarks). It can be taken a step further and search the content of the pages my friends have bookmarked. And maybe I can even search what everyone on the site has (but then we have to start worrying about spammers, which is a topic to think about later).

So what I am envisioning here is a full-blown search engine which, instead of using a crawler to index the web, uses the bookmarks of its users to power the search. The social nature helps expand your search possibilities while keeping the search narrow enough.

The first major site to do some sort of social bookmarking that I knew of was StumbleUpon. Then del.icio.us came along and really started to bring some attention to the idea. del.icio.us had some nice ideas like tagging, finding other people with the same bookmarks, using bookmarklets to submit bookmarks, and RSS feeds. As cool as it is, there is no way to search the content of the actual pages (which is a pretty big undertaking), so it wasn't quite what I was envisioning. I believe they didn't even have a way to search the whole site for bookmark titles until recently. Still, it brought some great ideas on social bookmarking to the table.

So I continued work on my little web application, but only a few hours here and there. I recently got motivated and added an index which is generated from the content of all the bookmarked pages. I also added a simple search feature which uses this index and can do either an AND or an OR search (but no complex queries that mix AND and OR). I've considered looking at Nutch to see if it could be used for these purposes, or possibly using Lucene alone, but I haven't had the chance to fully evaluate that yet.
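To give a rough idea of what an index with AND/OR search looks like, here is a minimal sketch of an inverted index. It is not the code from my application; the function names, the tokenizer, and the sample pages are assumptions made for the example, and a real version would first have to fetch and clean up the HTML of each bookmarked page.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> set of URLs containing that word.

    `pages` maps a bookmark URL to the (already fetched) text of the page.
    """
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return index


def search(index, query, mode="AND"):
    """Return the URLs matching all (AND) or any (OR) of the query words."""
    words = tokenize(query)
    if not words:
        return set()
    postings = [index.get(word, set()) for word in words]
    if mode == "AND":
        return set.intersection(*postings)
    if mode == "OR":
        return set.union(*postings)
    raise ValueError("mode must be 'AND' or 'OR'")


# Made-up page contents standing in for fetched bookmarks.
pages = {
    "http://example.com/lucene": "Lucene is a text search library",
    "http://example.com/nutch": "Nutch is a web crawler and search engine",
    "http://example.com/cvs": "CVS is a version control system",
}
index = build_index(pages)

and_hits = search(index, "search library", mode="AND")
or_hits = search(index, "crawler library", mode="OR")
```

Here `and_hits` contains only the Lucene page, while `or_hits` contains both the Lucene and the Nutch pages. Mixed queries would need a real query parser, which is one reason something like Lucene looks attractive.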

I have some vacation coming up, part of which I will spend on this project. So I might have something to show sometime soon. Until then it's vaporware :)

---

The amount of information on the internet is getting to ridiculous proportions. The signal-to-noise ratio is steadily declining. The search engine giants are doing relatively well at ranking the content, but certainly not as well as they used to. It's only going to get worse. Social bookmarking is one approach which may be able to help bring relevant search back. What will be interesting to see is how popular it becomes among the masses. del.icio.us has not reached mass appeal the way sites like Friendster and MySpace have. It's still something of a geek site. I don't even use it because it doesn't really do what I'm interested in. It's very cool that a mainstream site like Yahoo is taking a stab at it.

Yahoo doesn't appear to have a full site search. It looks like the full search just uses their regular engine (I could be wrong though). I wonder if they will be feeding the data into their regular search engine to help in ranking. In my idea for this, the ability to add friends lets you add people you trust to have bookmarks that aren't spam. But searching the whole site introduces the problem of spammers. People will intentionally create accounts just to add links to their sites. There will have to be a way of filtering out these spammers.

Maybe doing a search of all connected users in your network (like Friendster), with the option to specify degrees of separation, would be more useful than a full site search (though it is probably more expensive to filter the results that way). Or maybe some method of examining patterns would help, such as taking into account how many friends a user has, what kind of links they have, what IP addresses they are connecting from, how often they submit bookmarks, etc. Maybe the full site search should only give results from users who made it through the filters. It's a difficult problem, made worse by the fact that no matter what technology you implement, someone will figure out a way to spam it.
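Limiting a search to a number of degrees of separation could be done with a breadth-first walk over the friend graph. The sketch below is just one way to do it, with made-up names (`users_within`, `urls_in_network`) and made-up data. It assumes the whole friend graph fits in memory, which would not hold for a large site.

```python
from collections import deque


def users_within(friends_of, start, max_degrees):
    """Return {user: degrees of separation} for everyone within `max_degrees`.

    `friends_of` is a hypothetical mapping of user name -> set of friends.
    """
    distance = {start: 0}
    queue = deque([start])
    while queue:
        user = queue.popleft()
        if distance[user] == max_degrees:
            continue  # do not expand past the requested depth
        for friend in friends_of.get(user, ()):
            if friend not in distance:
                distance[friend] = distance[user] + 1
                queue.append(friend)
    return distance


def urls_in_network(friends_of, bookmarks_by_user, start, max_degrees):
    """Collect the bookmark URLs of every user within `max_degrees` of `start`."""
    urls = set()
    for user in users_within(friends_of, start, max_degrees):
        urls.update(bookmarks_by_user.get(user, ()))
    return urls


# Made-up network: alice - bob - carol - dave, plus an unconnected spammer.
friends_of = {
    "alice": {"bob"},
    "bob": {"alice", "carol"},
    "carol": {"bob", "dave"},
    "dave": {"carol"},
    "mallory": set(),
}
bookmarks_by_user = {
    "bob": {"http://example.com/bob"},
    "carol": {"http://example.com/carol"},
    "dave": {"http://example.com/dave"},
    "mallory": {"http://example.com/spam"},
}

network = users_within(friends_of, "alice", 2)
urls = urls_in_network(friends_of, bookmarks_by_user, "alice", 2)
```

With two degrees of separation, alice's search covers bob's and carol's bookmarks but not dave's, and the unconnected account never shows up no matter how many links it submits. The resulting set of URLs could then be used to restrict the results of the content search.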