Sunday, May 10, 2009

Charting

I've been asked to survey some of the existing Java charting libraries to see which one best matches our requirements without taking a tremendous amount of time or effort to integrate. After reviewing the APIs, documentation, tutorials, etc., I recommended JFreeChart for the following reasons:
1) It is well known, which can lend more credibility than lesser-known charting libraries.
2) I have some experience with JFreeChart already, so hopefully that will reduce the amount of time needed to integrate the library.
3) It has all of the charting features that we will require: pie charts, histograms, and scatter plots.
4) It also produces "pretty" charts.

Of course this is one aspect of the GUI that we will need to work on.

Saturday, May 2, 2009

Blog comments

Hello all, apparently I was not subscribed to the comment feeds from my blog :) Eek! Sorry, all, and thank you for your comments; they are greatly appreciated. In the process of learning the source code, I have decided to try to write one of the network metrics, primarily to learn the data structures. I'm running into a few design questions in the process:

#1) How do I find the directedness of an edge or graph?
#2) Some algorithms (e.g. modularity) might be interested in edge weights. Does Gephi support edge weights?
#3) Where does the output from the standard metrics go (visually)?
#4) As it turns out, a small number of Macs don't support Java 1.6 because the chips in these machines don't support 64-bit mode (or something along these lines). It's not the first time Steve Jobs has let me down, but this means I have to do my development on my Ubuntu machine. That's not a huge issue for me, but it may affect some of the other GSOCers or other Gephi developers.

Wednesday, April 29, 2009

Update

I have not had much time to read the papers just yet, but classes just ended and I should have much more free time in the coming weeks to work on this project. I was able to get the Wiki site up: http://gephigsoc.wikispaces.com/

I'll be adding some content this weekend. I'm hoping to add a page for problems that we encounter so that we can learn from each other's mistakes, etc.

One of my first goals is to write a proof-of-concept algorithm, probably for the clustering coefficient, to ensure that I know how to access the topological links of a given network. I realize that this is slightly out of order, but it falls under the heading of learning the existing code (and I get to complete one of the metrics on my list).
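For readers unfamiliar with the metric, here is a minimal sketch of the local clustering coefficient on an undirected graph. It deliberately does not use Gephi's graph classes: the `ClusteringSketch` class, the adjacency map, and the method names are all hypothetical and exist only for illustration. The coefficient of a node with k neighbors is the number of links among those neighbors divided by the k(k-1)/2 links that could exist.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-alone sketch; it does not use Gephi's graph API. */
final class ClusteringSketch {

    /** Adds an undirected edge a-b to the adjacency map (no self-loops assumed). */
    static void addEdge(Map<Integer, Set<Integer>> adj, int a, int b) {
        adj.computeIfAbsent(a, k -> new HashSet<>()).add(b);
        adj.computeIfAbsent(b, k -> new HashSet<>()).add(a);
    }

    /** Local clustering coefficient: links among neighbors / possible links. */
    static double localClustering(Map<Integer, Set<Integer>> adj, int node) {
        Set<Integer> neighbors = adj.getOrDefault(node, Collections.<Integer>emptySet());
        int k = neighbors.size();
        if (k < 2) {
            return 0.0; // fewer than two neighbors: no pair to connect
        }
        int links = 0;
        for (int u : neighbors) {
            for (int v : neighbors) {
                // u < v counts each unordered neighbor pair exactly once
                if (u < v && adj.get(u).contains(v)) {
                    links++;
                }
            }
        }
        return 2.0 * links / (k * (k - 1));
    }

    /** Mean of the local coefficients over every node in the map. */
    static double averageClustering(Map<Integer, Set<Integer>> adj) {
        if (adj.isEmpty()) {
            return 0.0;
        }
        double sum = 0.0;
        for (int node : adj.keySet()) {
            sum += localClustering(adj, node);
        }
        return sum / adj.size();
    }

    public static void main(String[] args) {
        // A triangle 0-1-2 with one extra node 3 hanging off node 0.
        Map<Integer, Set<Integer>> g = new HashMap<>();
        addEdge(g, 0, 1);
        addEdge(g, 1, 2);
        addEdge(g, 0, 2);
        addEdge(g, 0, 3);
        System.out.println("C(0) = " + localClustering(g, 0));
        System.out.println("average = " + averageClustering(g));
    }
}
```

The nested loop over neighbor pairs is quadratic in the degree, which is fine for a proof of concept but may need rethinking for very high-degree nodes.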

I believe that I understand the code to run the metrics; however, my question at the moment is how to display the results to the user and how to get input (if needed) from the user.

Saturday, April 25, 2009

Network Algorithm and Statistics

Hello everyone, this is my first post in the process of working on a Google Summer of Code 2009 project. The organization is Gephi, an open-source application for visualizing and analyzing large network graphs. My project (as the title of this blog suggests) is to develop code for some common (and some cutting-edge) network statistics and algorithms for the Gephi platform. My mentor is Sebastien Heymann.

So far I've been able to:

1) Install NetBeans
2) Download the 0.7 branch of Gephi (via bzr)
3) Start to examine the new Network Statistics API

This is my second year in GSOC, and I'm hoping this blog will have a better fate than last year's blog, which dwindled off after only a few posts.

Currently the statistics, metrics, etc. that we are planning to implement this summer are:
1) Clustering Coefficient
2) PageRank
3) HITS
4) Graph Distance metrics
   a) Average shortest path
   b) Network diameter
   c) Node betweenness centrality
   d) Node closeness centrality
5) Modularity (Community Detection)
6) Degree Distribution (scale-free power)
My full proposal can be found at: http://web.ecs.syr.edu/~pjmcswee/gephi.pdf
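
As an illustration of the graph distance metrics in item 4, the sketch below computes the average shortest path and the diameter of an unweighted, undirected graph by running a breadth-first search from every node. This is only one straightforward way to do it, not necessarily what the proposal or Gephi uses; the `DistanceSketch` class and its adjacency-list representation are hypothetical and independent of Gephi's API. Unreachable pairs are simply ignored, which is an assumption a real implementation would need to decide on explicitly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical stand-alone sketch; it does not use Gephi's graph API. */
final class DistanceSketch {

    /** Creates an empty graph with nodes 0..n-1. */
    static List<List<Integer>> newGraph(int n) {
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            adj.add(new ArrayList<>());
        }
        return adj;
    }

    /** Adds an undirected, unweighted edge a-b. */
    static void addEdge(List<List<Integer>> adj, int a, int b) {
        adj.get(a).add(b);
        adj.get(b).add(a);
    }

    /** Hop distances from source to every node; -1 marks unreachable nodes. */
    static int[] bfs(List<List<Integer>> adj, int source) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        dist[source] = 0;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            for (int v : adj.get(u)) {
                if (dist[v] < 0) {
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return dist;
    }

    /** Mean distance over all ordered pairs of distinct, mutually reachable nodes. */
    static double averageShortestPath(List<List<Integer>> adj) {
        long total = 0;
        long pairs = 0;
        for (int s = 0; s < adj.size(); s++) {
            for (int d : bfs(adj, s)) {
                if (d > 0) {
                    total += d;
                    pairs++;
                }
            }
        }
        return pairs == 0 ? 0.0 : (double) total / pairs;
    }

    /** Longest shortest path between any two reachable nodes. */
    static int diameter(List<List<Integer>> adj) {
        int max = 0;
        for (int s = 0; s < adj.size(); s++) {
            for (int d : bfs(adj, s)) {
                max = Math.max(max, d);
            }
        }
        return max;
    }

    public static void main(String[] args) {
        // A simple path 0-1-2-3.
        List<List<Integer>> g = newGraph(4);
        addEdge(g, 0, 1);
        addEdge(g, 1, 2);
        addEdge(g, 2, 3);
        System.out.println("average shortest path = " + averageShortestPath(g));
        System.out.println("diameter = " + diameter(g));
    }
}
```

With n nodes and m edges this costs O(n(n + m)) time, which is the kind of cost that becomes a concern on large networks.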


At present, one of my main questions is whether we need to differentiate between large and small networks, since some of these algorithms are very time-consuming. Exact algorithms are slow but practical for smaller graphs, while faster, less accurate approximations exist for larger graphs. Some investigation may be necessary to evaluate this issue.
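
To make the exact-versus-approximate trade-off concrete, here is a rough sketch of one possible approximation: estimating the average shortest path from breadth-first searches started at a random sample of nodes instead of at every node. This sampling scheme is an illustrative assumption, not a technique named in the proposal, and the `SampledDistanceSketch` class, its parameters, and the fixed seed are all hypothetical. When the sample size reaches the number of nodes, the estimate coincides with the exact all-pairs average.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-alone sketch; it does not use Gephi's graph API. */
final class SampledDistanceSketch {

    /** Hop distances from source to every node; -1 marks unreachable nodes. */
    static int[] bfs(List<List<Integer>> adj, int source) {
        int[] dist = new int[adj.size()];
        Arrays.fill(dist, -1);
        dist[source] = 0;
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.remove();
            for (int v : adj.get(u)) {
                if (dist[v] < 0) {
                    dist[v] = dist[u] + 1;
                    queue.add(v);
                }
            }
        }
        return dist;
    }

    /**
     * Estimates the average shortest path using BFS from at most
     * `samples` distinct, randomly chosen source nodes.
     */
    static double estimateAveragePath(List<List<Integer>> adj, int samples, long seed) {
        int n = adj.size();
        List<Integer> nodes = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            nodes.add(i);
        }
        // Shuffle and take a prefix so that no source is sampled twice.
        Collections.shuffle(nodes, new Random(seed));
        int k = Math.min(samples, n);
        long total = 0;
        long pairs = 0;
        for (int i = 0; i < k; i++) {
            for (int d : bfs(adj, nodes.get(i))) {
                if (d > 0) { // skip the source itself and unreachable nodes
                    total += d;
                    pairs++;
                }
            }
        }
        return pairs == 0 ? 0.0 : (double) total / pairs;
    }

    public static void main(String[] args) {
        // A ring of 1000 nodes; sample 50 sources instead of all 1000.
        int n = 1000;
        List<List<Integer>> ring = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            ring.add(new ArrayList<>());
        }
        for (int i = 0; i < n; i++) {
            int j = (i + 1) % n;
            ring.get(i).add(j);
            ring.get(j).add(i);
        }
        System.out.println("estimate = " + estimateAveragePath(ring, 50, 42L));
    }
}
```

The cost drops from O(n(n + m)) to O(k(n + m)) for k sampled sources, at the price of an estimate whose accuracy depends on how representative the sampled nodes are. A size threshold could then decide which variant to run, though where to put that threshold is exactly the open question above.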


I'm also working on setting up a wiki for my fellow Gephi GSOC-ers. I'm not sure where I should post it, so for now I'll set it up on my server, although I'm under the impression a better place may be at gephi.org.

That's all for now... more to come