Aardvark

I’ve had this draft post about Aardvark for about two weeks now. Now that they’ve been acquired by Google, I guess it’s about time to finally publish it.

I first heard about Aardvark via the Seattle Tech Startups mailing list and eventually got around to trying it. Few things get past my initial attempt, but I’ve still got Aardvark. It’s a question-and-answer service where you can ask questions yourself and answer questions of others.

What I’ve enjoyed the most about Aardvark (beyond its ability to send questions to the right people) is how easy to use and friendly it is. I interact with it via instant messenger, and every message it sends me includes all the instructions I need, in a friendly way, without being too verbose. It’s impossible not to understand how to use it.

Recently they published a paper - Anatomy of a Large-Scale Social Search Engine (the name is a reference to a famous Google paper) - which I found quite interesting. I was expecting more statistics about the usefulness and value of Aardvark than the paper had. The interesting part, however, is that Aardvark turned out to be far more sophisticated than I’d realized. As I read it I’d think of a way to make it even better and then, later on in the paper, find that they’d already done that. One astonishing graphic in the paper is their graph of users over time; that’s some impressive growth.

Now that Google’s bought them, I only hope that they’ll allow the founders to keep doing the good job they’ve been doing… I’ve seen too many excellent products wither after acquisition (e.g. Dodgeball and JotSpot).

On the US Gov’t “Going Google”

On the US Gov’t “Going Google” - very true. Google has had great success, and thus they are a problem.

MapDotNet : Blog : Google voice recognition software used in street data collection process?

Last week Google replaced their base maps in the US, no longer using data from TeleAtlas. The new data isn’t from OpenStreetMap either. I think we are finally starting to figure out where it all came from: MapDotNet : Blog : Google voice recognition software used in street data collection process?

Why Facebook Shouldn’t Fear OpenSocial

Why Facebook Shouldn’t Fear OpenSocial - I’m supposed to be studying, so of course it’s a good time to do some blogging.

Anyhow, I agree with Josh that the idea that the competition is now Facebook vs OpenSocial is silly. Facebook is doing an absolutely amazingly fantastic job pleasing users and developers, being innovative, and soon, generating profit. Their upcoming “Beacon” plans seem as brilliant as their previous ones. The only bad thing I have to say about them (from a business perspective) is that they have been way too slow getting their advertising products out. In the long run, that may not make much difference.

OpenSocial is not competition in any sense of the word. It’s just a little specification to standardize some web services, which is a good thing. And assuming it gains the traction it is expected to (that is, the supporters actually follow through), then Facebook will just join it too, and they won’t have lost anything, really. In fact they’ll have gained additional developers and applications.

Facebook would have to be really stupid to act any other way, and from what I’ve seen, they are anything but. Except their HR, I’m not so in love with that.

Is it just me, or is MySpace resting on their laurels? Just copying Facebook isn’t going to do it, and besides, they don’t seem to be copying them very well or quickly. I thought being the major player was supposed to count for something, like having resources.

One last comment on OpenSocial… while it is certainly good for developers that there will be a common API, let’s not forget that this simply means it will be easy to have an application run on multiple websites… separately. Having an application that seamlessly uses more than one social website simultaneously will still be an enormous headache. So there’s plenty more to be done there.

Update Nov 5. After reading a few things elsewhere, maybe MySpace isn’t doing nothing; they just decided to let Google deal with all their advertising, and hope to make enough from that. But since that will likely be almost all of their revenue, might that not be a bad idea?

Official Google Maps API Blog: Microformats in Google Maps

Official Google Maps API Blog: Microformats in Google Maps - they could do a lot more, but it’s a good start.

Acquisitions

I think there are two web companies in the last few years that I’ve really written about here, saying how great I thought they were: JotSpot and Feedburner. Now both (2/2) have been acquired by Google… of course, let’s hope this purchase goes better for Feedburner than the JotSpot one appears to have gone for JotSpot.

Google Webmaster Tools - Webmaster Tools

Google Webmaster Tools - Webmaster Tools - this will now show you all links that Google knows of to your own pages.

Some things I noticed

  • there is a lot of spam, especially search result pages and Wikipedia clones
  • Google’s indexing multiple versions of wiki pages on pbwiki.com

So I was looking at the stats for Fagan Finder; it’s always nice to see new data, and I’ve never been very good at keeping track of it all myself. According to Google, 64% of the links are to the Translation Wizard. In second place is my article on RSS at 13%, and URLinfo at 8%. 98% of links to the Translation Wizard are websites using the translate this page feature. I had to play with the data in Excel to arrive at this.
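
The tallying described above can also be done in a few lines of code rather than a spreadsheet. This is only a minimal sketch, assuming the link report has been exported as (linking URL, target page) pairs; the `link_share` name and the sample rows are made up for illustration and are not the real Fagan Finder data.

```python
from collections import Counter


def link_share(links):
    """Return (target, percent) pairs, most-linked target first.

    `links` is an iterable of (source_url, target_page) pairs.
    """
    counts = Counter(target for _source, target in links)
    total = sum(counts.values())
    return [(target, 100.0 * n / total) for target, n in counts.most_common()]


# Hypothetical sample rows, not real data.
sample = [
    ("http://a.example/", "/translate/"),
    ("http://b.example/", "/translate/"),
    ("http://c.example/", "/translate/"),
    ("http://d.example/", "/rss/"),
]

for target, pct in link_share(sample):
    print(f"{target}: {pct:.0f}%")
```

Sorting by count first makes a lopsided distribution easy to spot, since the top one or two targets dominate the list.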

Clearly there’s a power law going on here. And this is yet another confirmation that the Translation Wizard was a huge winner, and since it is largely broken now, seriously needs an update. I just do not have the time. Ideas, anyone?

Official Google Blog: Spot On

Official Google Blog: Spot On - Google has bought JotSpot; this is big news to me (and far, far more interesting than YouTube).

I’ve considered applying to work at JotSpot in the past, but one of the reasons I didn’t is that their stuff seemed so good I wasn’t sure if I had anything to add. I first wrote about JotSpot in October 2004. Crazy.

This doesn’t bode very well for Microsoft, that’s for sure. If Google plays their cards right, that is.

Google Code - Updates: New GData API: Google Base

Google Code - Updates: New GData API: Google Base - two of my biggest hopes - Google opening up Google Base, and wider adoption of APIs based around OpenSearch - all at once. This could be big.

Google Data APIs Protocol

Google Data APIs Protocol - interesting move from Google. I (and others) have thought for a while that combining OpenSearch’s read capabilities with the Atom Publishing Protocol’s write capabilities would create a very powerful API, and that’s roughly what Google is doing here.

It’s great to see the OpenSearch support (a bit - they’re using startIndex, totalResults and itemsPerPage), but I’d like to see them using it more. Some of what they’re doing is contrary to how OpenSearch works (that’s not a problem per se), as they’re using predefined query names such as q and max-results (and a folder for categories) rather than allowing people to use whichever they want and then specify them in an OpenSearch Description file.
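
To make the three response elements concrete, here is a rough sketch of reading them out of an Atom results feed with Python’s standard library. The sample feed is invented, and the namespace URI is an assumption (it is the OpenSearch 1.1 one; a real feed may declare a different version), so treat this as an illustration rather than a description of Google’s actual output.

```python
import xml.etree.ElementTree as ET

# Assumption: the OpenSearch 1.1 namespace. A given feed may declare another.
OPENSEARCH = "http://a9.com/-/spec/opensearch/1.1/"

# Invented sample feed, for illustration only.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/">
  <title>Hypothetical search results</title>
  <openSearch:totalResults>42</openSearch:totalResults>
  <openSearch:startIndex>1</openSearch:startIndex>
  <openSearch:itemsPerPage>10</openSearch:itemsPerPage>
  <entry><title>First result</title></entry>
</feed>"""


def paging_info(feed_xml):
    """Return the OpenSearch paging elements found directly under <feed>."""
    root = ET.fromstring(feed_xml)
    info = {}
    for name in ("totalResults", "startIndex", "itemsPerPage"):
        element = root.find(f"{{{OPENSEARCH}}}{name}")
        # Missing elements are reported as None rather than guessed at.
        info[name] = int(element.text) if element is not None else None
    return info


print(paging_info(SAMPLE_FEED))
```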

In that same vein, it would be nice to see them make use of autodiscovery, as Atom, RSS, OpenSearch, and others do. Upon first inspection I would say these autodiscovered documents could be OpenSearch Descriptions, but I may be wrong about that.
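
For anyone unfamiliar with autodiscovery, the idea is that a page advertises a machine-readable document through a link element in its head. Below is a small sketch of how a client might look for an OpenSearch Description that way. The sample page and the `find_description_links` helper are hypothetical, and nothing here is taken from the Google documentation.

```python
from html.parser import HTMLParser

# Media type used for OpenSearch Description documents.
OSD_TYPE = "application/opensearchdescription+xml"


class SearchLinkFinder(HTMLParser):
    """Collect href values of <link rel="search"> elements of the OSD type."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        link_type = (attributes.get("type") or "").lower()
        if "search" in rel and link_type == OSD_TYPE:
            self.links.append(attributes.get("href"))


def find_description_links(html):
    """Return the autodiscovered description URLs in document order."""
    finder = SearchLinkFinder()
    finder.feed(html)
    return finder.links


# Invented page, for illustration only.
SAMPLE_PAGE = """<html><head>
<link rel="alternate" type="application/atom+xml" href="/feed.atom">
<link rel="search" type="application/opensearchdescription+xml" href="/opensearch.xml">
</head><body></body></html>"""

print(find_description_links(SAMPLE_PAGE))
```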

One interesting thing to note is that they mention how startIndex is 1-based (which is true), and then display an example with a value of “0”. Sounds like DeWitt is right, it does need to handle 0-based numbers too; even Google is making that mistake.
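
One lenient way for a server to cope with this is to treat 0 the same as 1 when turning startIndex into a list offset. The sketch below is just that one interpretation, with hypothetical function names; it is not what the OpenSearch spec or Google prescribes.

```python
def to_offset(start_index):
    """Convert a 1-based startIndex to a 0-based offset, tolerating 0.

    Both 0 and 1 map to the first result (an assumption made for leniency).
    """
    if start_index < 0:
        raise ValueError("startIndex must not be negative")
    return max(start_index - 1, 0)


def page(results, start_index, items_per_page):
    """Return the slice of `results` that the request asks for."""
    offset = to_offset(start_index)
    return results[offset:offset + items_per_page]


results = list("abcdefghij")
print(page(results, 0, 3))
print(page(results, 1, 3))
print(page(results, 4, 3))
```

The first two calls return the same page, which is the point: a client that mistakenly sends 0 still gets the first results instead of an error or an off-by-one page.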

DeWitt brings up some other good points as well.

Via Niall.

Update: Joe Gregorio weighs in

Update 2: Marc Canter (one of my favourite bloggers) finds this linkworthy ;-) although I’m always amazed at the spellings my name gets.

I’m not always right

me, upon learning about GoTo (1998 or perhaps 1999):
This site won’t last five minutes. If anybody can game the system (pay their way to the top), the results will suck and nobody will use it.
reality:
GoTo became Overture, bought AllTheWeb and AltaVista, and got bought by Yahoo for $1.63 billion. Their business model was copied by Google, where it accounts for the vast majority of their enormous and growing revenue, and completely turned the company around. I was wrong because I hadn’t thought of the scale. When one particular company knows they can pay to get higher, the results will suck, but when every other company in the industry knows that too, economics will sort things out; it’s in their best interests to be relevant to searchers.

me, upon learning about PageRank (2000 or 2001 probably):
That’s terrible! If they only show popular sites, those sites will just get more popular and it won’t be long before no new sites ever get found by anyone.
reality:
I was partially correct. However, the popular were already becoming more popular anyway, as people followed links. Also, PageRank is just part of Google’s algorithm, which also incorporates timeliness, and it is definitely possible for newcomers to get in.

me, upon learning of Technorati (2002):
um… what’s the point? It just shows backlinks, it doesn’t even have any search, nor a “what’s popular” listing (like Daypop for instance).
reality:
One thing I forgot: people love knowing who links to them. Today Technorati does have search (of several sorts), and a whole bunch of “what’s popular” listings. They’re also the most successful (at least in terms of popularity) blog search engine, very innovative, and have spearheaded the wonderful microformats initiative, largely thanks to Tantek.

Google in Waterloo Photos

See previous post for details

[Photos: PICT0002, PICT0005, PICT0008, PICT0014, PICT0020]

Google in Waterloo

So I attended Google’s event (party?) on campus today. It was not really what I expected: essentially an event to meet Google employees and ask them questions, with a little talk by Roger Skubowius (of Reqwireless, now Google), then one by Craig Nevill-Manning of Google New York.

Quite a few people there, as expected, not to mention one or two tv crews. It was essentially a showing-and-planting-the-flag event, with few details. The former Reqwireless people were there as well as two from the Toronto office (mostly AdWords stuff), a couple from Mountain View, and several very recent hires for the new Waterloo office.

I did hear (from Google employees) that they want to expand the Waterloo office with engineers as fast as they can. It seems that wherever they are now is probably a somewhat temporary location. I’m not sure where that is, but it is in Waterloo and not in the Research and Technology Park on the north end of campus (where OpenText and others are).

Not too much to say really. The food was good, especially the giant Google cake. Maybe when I get home I’ll post some photos. Update - photos here.

Update: notes elsewhere from Adrian Dewhurst, Girl Mathie, SunShine, Dmitry, and Imprint.

PICT0014 - 14-Mar-06 14:06:18
PICT0008 - 14-Mar-06 13:15:33
PICT0005 - 14-Mar-06 12:51:09
PICT0020 - 14-Mar-06 14:40:42
PICT0002 - 14-Mar-06 12:09:26


Google Toolbar Button API Follow-up

My last post was my initial reaction to this new API from Google. It’s not surprising that I’m worried about Google’s plans here, as their record on XML cooperation hasn’t been all that stellar. I haven’t fully looked into it yet, but I had noticed Google’s absence from a new standardization effort: Retailers, Engines Want Standard for Product Description (via Gary) lists MSN, Yahoo!, and others.

Anyhow, getting down to the real point, I’ve decided to completely skip over “What Google Should Have Done,” and go right ahead to “What Google Should Now Do.” Save myself the wasted keystrokes.

Step 1: Fix Feed Refresh Interval

Remove the refresh-interval attribute from <feed>. Add it to RSS/Atom in a namespace. This shouldn’t really change anything. This has nothing to do with OpenSearch by the way, it’s just my general opinion on XML - extend an existing format rather than creating a new one.
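
To show roughly what that could look like, here is a sketch of an Atom feed carrying the refresh interval as a namespaced extension element, plus a reader that picks it up. The namespace URI, the `hint` prefix, and the value are all invented for this example; Google has defined no such extension that I know of, and the unit would be whatever the extension chose to specify.

```python
import xml.etree.ElementTree as ET

# Hypothetical extension namespace, invented for this example.
EXT = "http://example.com/ns/feed-hints"

# Invented sample feed. Readers that do not know the extension just ignore it.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:hint="http://example.com/ns/feed-hints">
  <title>Example feed</title>
  <hint:refresh-interval>1800</hint:refresh-interval>
</feed>"""


def refresh_interval(feed_xml, default=None):
    """Return the extension's refresh interval as an int, or `default`."""
    root = ET.fromstring(feed_xml)
    element = root.find(f"{{{EXT}}}refresh-interval")
    if element is None or not (element.text or "").strip():
        return default
    return int(element.text)


print(refresh_interval(SAMPLE))
```

The feed stays a perfectly ordinary Atom document, which is the whole argument for extending an existing format instead of inventing a new one.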

After I started writing this, DeWitt posted his take on it all: Google Toolbar, Custom Buttons, and OpenSearch. It includes a lot of what I was going to say, so I will continue my comments as a reply to his post.

A final note, for anyone that’s counting… this makes at least four different Google products that are RSS/Atom readers (Google Reader, Google Toolbar, Google Personalized Homepage, Google Desktop). I hope they’re all using the API that the Google Reader team has been developing.

Google Toolbar API - Guide to Making Custom Button

Google Toolbar API - Guide to Making Custom Button - aaaargh. I see Google’s recreated the OpenSearch Description format. Nice job guys. Oh yeah, and it also functions as an RSS feed information thingy…. which as far as I can tell, only provides refresh rate…. if they need that so badly they could make that element an extension to RSS/Atom.

It seems like Google’s attitude nowadays is “developers like APIs, and they like XML, so let’s create lots and lots of little tiny APIs and new XML formats.” How about a new search API, like one for images? The web search API was last updated years ago… Oh, in case we’re counting, Google has now created XML formats for sitemaps (but they accept RSS and Atom, so what was the point?), homepage modules (why not use HTML, as I’ve written before?), “buttons” (Google Toolbar), 50 (exaggeration) kinds of microcontent (Google Base), etc.

More later when I get back from school and have time to look into this more fully.

Google Code: Web Authoring Statistics

Google Code: Web Authoring Statistics - excellent stats from Google on HTML usage. There are so many times that I have wanted this. I have, however, made use of somewhat more basic stats on RSS from Syndic8.