The TRS Project

14 December, 2004 at 11:43 5 comments



TRS
or Time Re-Sorter
is exactly what
it says on the tin. It is
a library that re-sorts stuff according to pre-fixed bins
.
(poetic,
ain’t
it?)
The project has had its origins when there was a call to write a more
naturalistic GAIM log
browser for the Beagle
project. Thanks
to the inspiration provided by Tuomas
through his mockups
and Nat through his suggestion of
start-small, the authors, Raf
and Srikant took up the
challenge and combined their efforts and enthusiasm in true
open-source spirit and voila, came about with TRS.
           
    In the course of their discussions, they implemented
the log browser using a generic TRS architecture. They then decided to
give TRS a life of its own as the functionality of naturalistic sorting
(or more precisely, labelling-via-rules)
could have a utility well
beyond its intended purpose. Because a picture speaks a 1000 words,
here is what TRS does as of today –


TRS Screenshots


TRS Screenshots


TRS Screenshots



More on TRS  

As one can see from above, we have a nice
seperation of GAIM log files into naturalastically rendered ‘timeslices’. We have
both a pure Java viewer and a GTK# viewer highlighting how easy it is
to build such viewers leaving the main allocation work to TRS. Further,
we can
also see TRS being used to view files as per their
modification date highlighting how trivial it is to build parsers to
sort any timestamped bunch of entities. These are working screenshots
but much
needs to be done and we are working on it. Any contribution is
always welcome 🙂


 Coming back, TRS uses a
smart bin-based algorithm and its own internal
datastructure (Red-Black Trees courtesy of an existing Java library due
to <>) to achieve its functionality without
sacrificing speed. TRS can take in any timestamped entities in the form
of ‘tuples’ (with empty values
if desired) and put them in pre-fixed ‘bins’
(like TODAY, THIS-WEEK, LAST-MONTH etc.) that can be specified at
compile-time in the source code. It is probably apparent to most
readers that such re-sorting functionality is tremendously useful since
every digital entity has a timestamp associated with it by default.
Whether it be files, emails, IM logs, transactions, bookmarks, surf
history, calendar
events, tasks etc. Time by itself has much value and if used
intelligently, can yield a lot of benefit to users manage/find/browse
through their information.



Using/Extending/Discussing TRS  

We used a clean and modular design/architecture to implement
TRS as per the figure below –

TRS Components

Logs – Any bunch of entities
with timestamp information that need to be re-sorted. Eg. GAIM
conversations, Email, Files…
Parsers – Are code or
standalone applications that can read in Logs and generate a ‘tuple’ iXML file(s)
[look at comments for the DTD]. If a Java class, the tuples can be
‘streamed’ to
TRS
TRS – It is the core library.
For most practical purposes, just a
black-box. It takes in an iXML file(s) of
tuples and generates labelled oXML file(s)
using a ‘bin-allocation’ algorithm
XML Readers/Printers – Utility
functions in TRS which process XML strictly adhering to DTD’s defined.
A pure Java application can bypass them using the inbuilt ‘streaming’
tunnel
Viewers – Are code or
standalone applications that can read ‘labelled’ oXML
file(s) [look at comments for the DTD] and display them to user. If a
Java class,
the elements can be ‘streamed’ from TRS


Thus, TRS is a trivially easy to
use XML i/o based library so that any
application(s) can use TRS individually or combinedly. The
FileModifiedXXX() is an example combo of such an application.
If an
application is written in Java, it can bypass the XML stages. The
GaimLogBrowser() is an example of such an application
combo.
From this, we hope
to convince people that it is quite easy to write an
application combo for anything like email (needs
an mbox parser atleast and a clever viewer [a line graph with time on
X-axis, cardinality on Y-axis and addressbook on Z-axis] atmost). All
is needed is a simple parser and a desirable viewer if the one provided
with TRS is not enough.

In other words, TRS takes in a list of timestamped entities
(conversations, emails, browser history, search results, bookmarks
first/last visited, history…) and then groups them together with
labels that
any front-end app (like Beagle) could then use for whatever reasons
(like display) as per this early figure –

IM log browser mockup, call for hackers?

TRS could have sub-components (like different languages for
labels or libraries for viewers) and different sorting mechanisms for
differential i/o processing. This is in tune with the
<presentation> about <Dashboard> sometime ago, which
were the kind of things that were planned i.e. application independence
and a re-usable streaming architecture.


Writing a Parser:
The iXML
DTD is quite easy-going. Any parser should give TRS a ‘tuple’
with a timestamp field filled in. A tuple is –
<timestamp, snippet, link>
TRS will only use the timestamp
when sorting into bins. The rest could be used for info-sensible
display. The choice of snippet
is entirely upto the parser. It could contain an ‘intelligent’
start-point of an IM conversation or minimal email headers or a keyword
list in a text-file, a thumbnail path etc. The link could be the absolute path of
the IM log or a position in a file or progam invocator or a URL etc.
Again, it is a choice entirely left to the justification of the parser.
All the parser has to do is to read in Logs and write iXML or stream
the tuples into TRS for processing. Please look at the code of the
parsers provided.


Using TRS:
Usage is simple and straightforward. First an iXML file needs
to be generated by the parser –
$
java -jar trs.jar FileModifiedParser <directory> -o in.xml

 Parsers need not be written in Java. TRS just requires an iXML file no
matter who generates it. Next, TRS needs to be invoked –
$
java -jar trs.jar TRS –i
in.xml –o out.xml

This step generates an oXML file which
is then used by a viewer for display –
$
java
-jar
trs.jar
FileModifiedViewer
out.xml

Viewers need not be written in Java. We are ourselves surprised at the
modiularity but if pure Java is comfortable, then TRS can be used
internally within a controller application –
$
java -jar trs.jar GaimLogBrowser <path-to-GAIM-logs>

If the labels and timeslice ranges need to be changed, please modify
the code of TRS.java and re-compile –
$
./compile.sh

Use as before…
Note:
TRS can also be used programmatically and a C# port is being written to
this effect


Writing a Viewer:
The oXML
DTD is quite easy-going as well and is also quite robust. Any
viewer can take this XML file and decide what it wants to with it to
its whim and fancy. That is all that needs to be said. Please look at
the code of the viewers provided. We have included both a Java and GTK#
viewer for reference.


For the Curious:
Here is the latest UML snapshot of TRS. It is still not very clean but
it works and will be modularized to work exactly as above. Right, UML –

TRS-UML-dec-13


If anyone has any suggestions of how this could be done better or
faster or if we are re-inventing something, do let us know.


Downloads:
The latest version of TRS can be
<downloaded from here>. We are looking to export this project to
Sourceforge very soon so that we will have a CVS repository
established. We did not think there was a need to do it as TRS should
primarily be seen as a black-box with contributions of parsers/viewers
ideally proceeding independently from TRS. Till there is a
better way to integrate development, please send in your code
contributions by email to the authors (who will work to improve TRS
core and be helpful) and we will combine them all together into single
one-shot
releases whenever anything significant happens.



News and Status-Quo 

Dec’13 – 23:00hrs
Good night, I just finished the first part of a red-black tree,
insertions works, but not removal. We’re just steps away from a trivial
C# port for Beagle developers…

Dec’13 – 19:30hrs
Adding TimeSlices on the fly is now done and works. When you add a
slice, only the events that are concerned (that are in the old bigger
slice) are removed from that old slice and then reinserted in the new
one, that’s O(N log N) I suppose. Good work!

Dec’13 15:50hrs

EventParser now has two constructors one with a TRS, one with a File
and one with both. When you build the Parser, it generates either
addEvent in the trs, or it appends the event in xml to the specified
file, or both.
So with this system, the gaimlogeventparser has not chaged since all of
this is handled in the abstract class EventParser in
fireNewEventParsed.
With this , all needs to be done is to write a XmlEventParser that only
take a File as input and a TRS in the constructor (no output file
because it has no sense, it would generate an exact copy as output) and
TRS will get the events as the XML is parsed !

I’m myself amazed with this modularity



To-Do’s, Questions, Bugs, Feature Requests et al. 

To-Do’s:

Inline Find
C# Port
Bins
Rules
Testing
Code Cleanup


Questions:
1) How to split XML files and achieve granularity ala Russian dolls?
2) How to get created/accessed timestamps of files in Java? There is
some library that escapes our memory
3) For any issues whatsoever, please contact the developers or
better still, use the comments system on this page


Bugs:



Advertisements

Entry filed under: Uncategorized.

GU: Top 100 Engg. and IT Universities Google Desktop Proxy

5 Comments Add your own

  • 1. S 'naani' J  |  14 December, 2004 at 13:01

    iXML DTD
    —-
    [?xml version=’1.0′ encoding=”UTF-8″?]
    [!ELEMENT tuples (event+)]
    [!ELEMENT event (timestamp,snippet,link)]
    [!ELEMENT timestamp (#PCDATA)]
    [!ELEMENT snippet (#PCDATA)]
    [!ELEMENT link (#PCDATA)]
    —-
    Please replace []’s with <>‘s

    Like

    Reply
  • 2. S 'naani' J  |  14 December, 2004 at 13:02

    oXML DTD
    —-
    [?xml version=’1.0′ encoding=”UTF-8″?]
    [!ELEMENT bins (pastbins,defaultbins,futurebins)]

    [!ELEMENT defaultbins (timeslice+)]
    [!ELEMENT pastbins (timeslice)]
    [!ELEMENT futurebins (timeslice)]

    [!ELEMENT timeslice (label,begin,event+)]
    [!ELEMENT label (#PCDATA)]
    [!ELEMENT begin (#PCDATA)]
    [!ELEMENT event (timestamp,snippet,link)]
    [!ELEMENT timestamp (#PCDATA)]
    [!ELEMENT snippet (#PCDATA)]
    [!ELEMENT link (#PCDATA)]
    —-
    Please replace []’s with <>‘s

    Like

    Reply
  • 3. S 'naani' J  |  14 December, 2004 at 13:40

    IM Log Browser Mockup, Call for Hackers?
    Tue Dec 7 9:00 pm
    Tuomas Kuosmanen, Novell to Dashboard Hackers
    —-
    Hello!

    I recently designed a mockup for a IM log browser that would in my opinion be better than the one in GAIM currently. The improvements mainly are in the way the list of im logs are displayed, so that they are grouped per date in a sensible way.

    More information and the mockups can be seen here:
    tigert.gimp.org/log/archives/2004/11/29/date-revisited-new-thoughts

    If someone wants a small hacking project, this might be a good one to start with. I am happy to provide assistance with the user interface etc..

    The idea is, that when you click an IM log in Best, it would open this log browser with the actual log selected, and the search term in the search field, and the rest of the matching logs grouped by timeline in the list.

    Anyway, interested? Let me know! : -)

    Tuomas

    Like

    Reply
  • 4. S 'naani' J  |  14 December, 2004 at 14:14

    Re: IM Log Browser Mockup, Call for Hackers…
    Date: Wed Dec 8 1:14 pm
    From: Lukas Lipka, Pmad
    —-
    Ive worked on a IM log viewer for Beagle a while back. It never got in because it used Gtk Html and we switched to gecko. I would be able to write the one tigert has mocked up if I get time (hopefully soon). But I have a dilema whethever it should be a part of Beagle or a standalone app so anynone (like Gaim) could use it. Anyone with some useful ideas?

    Best,
    Lukas

    Like

    Reply
  • 5. S 'naani' J  |  3 April, 2006 at 22:53

    I also found (I think) an elegant data structure for the thing using TreeSet’s of java.

    So we have insertion of new TimeSlice’s in O(log Nslices) This is done once at the creation of the TimeReSorter, so it is
    basically a O(N * log N) for N slices to create.

    We also have O(log Nevents) insertion for an event in a given category
    (events are sorted even if in the same bin).

    And we have an O(log Nslices) for querying the trs to give the events corresponding to a given timeslice.
    The iterator for all the events with a given timeslice is so O(Nevents * log Nslices).

    I think these are quite accepable performance, seen we have sorts to do
    anyway.

    The code I joined contains also a file modification date parser, so it
    can generate a lot of events to judge the performance of the thing, and
    I must say it’s quite fast ! (It indexed nearly 10000 files + logs in
    seconds)

    Like

    Reply

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Trackback this post  |  Subscribe to the comments via RSS Feed


Calendar

December 2004
M T W T F S S
« Nov   Jan »
 12345
6789101112
13141516171819
20212223242526
2728293031  

Tweets


%d bloggers like this: