Mycroft Web/Journal Search Plugins Fixing Project

22 November, 2004 at 19:37

Inspired by Linus Torvalds’ interview[1], I have set myself the
modest task of correcting/updating/cleaning/modifying/*ing the Mycroft
Search Plugins. It is just too good a technology to be done as badly as
it is being done. In particular, I am trying to fix the WWW and
Literature/Journal search engine plugins. The reason for this rather
philanthropic effort is my own selfish end of having fun with
search engine rankings and ‘sensing’[2] the WWW as we know it by
integrating other cool technologies such as “Robust Signatures”[3] to
keep tabs on such phenomena as Googledance[4].

Anyone who has used the Mycroft search plugins (on Mozilla, not
Firefox) and observed their nuances, their subtleties and the
functionality they provide (if properly done) will perhaps agree with
me that they are a potential gold mine for conducting experiments.
Further, Mycroft can also be used to do cool stuff and help users
achieve things like writing an essay in a snap. For example, just
search for something in Google, Funquotes, Dictionary, Quotomatic,
Wikipedia, Freesearch Pictures and other obscure engines, and one can
put an essay together in a good, critical, structured way: start with a
quote, then a joke, some pictures, then the meat mixed with trivia,
then conclusions and then references. Just try the search on the
engines mentioned and one gets a hazy idea.

Given this scientific and end-user potential, I think it is justified
to make Mycroft work ‘perfectly’ and ‘beautifully’[5] for the most
widely used search engines in existence. Here is the list I’ve got so
far –

Google Teoma MSN Citeseer Lycos Feedster HotBot Metafilter
DMOZ Terrier(s)
AskJeeves Altavista AllTheWeb Freesearch
Scirus Vivisimo Looksmart Yahoo Dogpile Metacrawler ACM-DL
Clusty Technorati LS-Articles

1) Clicking on the list should give me a proper preview of the link/summary
2) The preview should be in a visible font (it is too tiny)
3) The preview could be a 600×480 thumbnail from Thumbshots/Alexa
4) Just like the list, it would be cool to have a grid of thumbnails of
the 1st-ranked docs
5) Ability to get back a document within the results (see [3] below)
outwith the Top-10
Contact: If anyone is willing to join me in this effort, please do. You
can reach me through –
sriks-www (a) dcs _ gla _ ac _ uk

[1] Linus Torvalds:
Market has already started”
St. Pierre of Linux Times interviews Linus Torvalds

“Nobody should start to undertake a large project.
You start with a small _trivial_ project, and you should never expect
it to get large. If you do, you’ll just overdesign and generally think
it is more important than it likely is at that stage. Or worse, you
might be scared away by the sheer size of the work you envision”

“So start small, and think about the details. Don’t think about some
big picture and fancy design. If it doesn’t solve some fairly immediate
need, it’s almost certainly over-designed. And don’t expect people to
jump in and help you. That’s not how these things work. You need to get
something half-way _useful_ first, and then others will say “hey, that
_almost_ works for me”, and they’ll get involved in the project”

“And if there is anything I’ve learnt from Linux, it’s that projects
have a life of their own, and you should _not_ try to enforce your
“vision” too strongly on them. Most often you’re wrong anyway, and if
you’re not flexible and willing to take input from others (and willing
to change direction when it turned out your vision was flawed), you’ll
never get anything good done”

[2] Secret Project. I have to write an RFC very soon.
[3] Robust Signatures: A fusion project (another document to be written) of location and content analyses involving –
Hyperlinks and Locations Archived Site

Persistent URLs
Signatures from Multivalent Project

If we wanted to keep track of our own pages, we could just include a
signature in each page and wait a couple of weeks after submitting it
to the search engines. Once they have finished crawling and indexing,
we can use Mycroft to see how our page ranks in the overall index and
for arbitrary queries…
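
As a rough illustration of what such a signature could look like: the robust hyperlinks work in the Multivalent project, as far as I understand it, identifies a page by a handful of its rarest words. The sketch below is only one way to approximate that idea. It assumes a plain TF-IDF score as a stand-in, and the names `lexical_signature`, `doc_freq` and `n_docs` are hypothetical, as is the document-frequency table, which would have to come from some corpus or index of your own.

```python
import math
import re
from collections import Counter


def lexical_signature(text, doc_freq, n_docs, size=5):
    """Return the `size` terms of `text` with the highest TF-IDF score.

    doc_freq maps a term to the number of documents containing it
    (hypothetical data, supplied by the caller); n_docs is the total
    number of documents behind that table.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    term_freq = Counter(words)

    def score(term):
        # Smoothed IDF: terms missing from doc_freq count as very rare.
        idf = math.log((n_docs + 1) / (doc_freq.get(term, 0) + 1))
        return term_freq[term] * idf

    # Highest score first; ties are broken alphabetically so the
    # result is deterministic.
    ranked = sorted(term_freq, key=lambda term: (-score(term), term))
    return ranked[:size]
```

The resulting words could be pasted into a search plugin as a query, on the assumption that a page containing all of them is probably the page being tracked.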
[4] Googledance

“Because Google stores copies of
their index around the world — as it
is being updated you can search the same phrase at different times and
get different results. lets you watch your results
across the indexes”

Now we can do the same thing by comparing how one particular page fares
across different search engines. We still use multiple indexes, but now
they belong to different companies…
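
The comparison could be sketched roughly as follows, assuming the result URLs for one query have already been collected from each engine (for instance by hand from the Mycroft sidebar). The function names and the engine labels in the usage note are hypothetical, and the URL normalisation is deliberately crude.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL to host + path so trivial variants compare equal.

    Crude on purpose: the scheme, a leading "www." and a trailing
    slash are dropped, and everything is lower-cased (which is not
    strictly correct for case-sensitive paths).
    """
    parts = urlsplit(url.strip().lower())
    host = parts.netloc
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def rank_of(url, results):
    """Return the 1-based position of `url` in `results`, or None."""
    target = normalise(url)
    for position, candidate in enumerate(results, start=1):
        if normalise(candidate) == target:
            return position
    return None


def compare_engines(url, results_by_engine):
    """Map each engine name to the rank of `url` in its result list."""
    return {
        engine: rank_of(url, results)
        for engine, results in results_by_engine.items()
    }
```

Calling `compare_engines("http://example.org/page", {"engine_a": [...], "engine_b": [...]})` once a day and keeping the dictionaries would give a simple record of how the page moves in each index over time.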
[5] Unknown Quote
“It is a universal duality. The mind seeks perfection. The heart
seeks beauty”

Beauty and Perfection. So much to talk about. So little space. So
little time…


