Monday, June 22, 2009

Delivery #3

I've just pushed delivery #3, which includes working versions of the Clustering Coefficient, Degree Distribution, and Graph Distance metrics. I keep meaning to document the design (diagrams, etc.) on the wiki; I haven't found the time yet, but I will get to it this week.

I had to spend some time resolving packaging issues and getting everything up-to-date and merged.

Some good advances today: I created a package for handling the GUI and an interface that lets GUI parametrization run smoothly.

Aside from continuing to work on the framework and beefing up the parametrization and the reporting, only three main statistics remain: HITS, PageRank, and Modularity.

In case anyone other than my mentors wants to try out what's done so far, the bzr repository can be reached via
bzr branch lp:~pjmcswee/gephi/Statistics

Below are some notes concerning the current state of affairs.

Accomplished:
(1) Statistics are now run by the LongTaskExecutor and implement
LongTask (thank you Mathieu).
(2) I've checked everything in that I think you'll need for running
the project. Thank you for your patience with me on this issue :)
(3) Clustering Coefficient, Degree Distribution & Graph Distance
metrics all run successfully (see issues).
(4) I've created a StatisticsUI similar to GenerateUI that is used to
parametrize the statistics.
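
As a cross-check for the clustering coefficient in item (3), here is a minimal brute-force sketch. It is not the project's code: the class BruteForceClustering and its methods are hypothetical, and the directed case uses one common definition (directed links among the neighbors divided by k(k-1), where neighbors are nodes connected in either direction), which may differ from the one used in the branch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: brute-force local clustering coefficient on a tiny adjacency-set graph. */
final class BruteForceClustering {
    // out.get(u) holds every v with a directed edge u -> v.
    private final Map<Integer, Set<Integer>> out = new HashMap<>();

    void addEdge(int u, int v) {
        out.computeIfAbsent(u, k -> new HashSet<>()).add(v);
        out.computeIfAbsent(v, k -> new HashSet<>()); // make sure v is a known node
    }

    private boolean hasEdge(int u, int v) {
        return out.getOrDefault(u, Collections.<Integer>emptySet()).contains(v);
    }

    /** Neighbors of v in either direction, excluding v itself. */
    private List<Integer> neighbors(int v) {
        Set<Integer> n = new TreeSet<>(out.getOrDefault(v, Collections.<Integer>emptySet()));
        for (Map.Entry<Integer, Set<Integer>> e : out.entrySet()) {
            if (e.getValue().contains(v)) {
                n.add(e.getKey());
            }
        }
        n.remove(v);
        return new ArrayList<>(n);
    }

    /** Checks every pair of neighbors; O(k^2) per node, which is fine for a reference check. */
    double local(int v, boolean directed) {
        List<Integer> n = neighbors(v);
        int k = n.size();
        if (k < 2) {
            return 0.0;
        }
        int links = 0;
        for (int i = 0; i < k; i++) {
            for (int j = i + 1; j < k; j++) {
                int a = n.get(i);
                int b = n.get(j);
                if (directed) {
                    // Each direction counts as its own link.
                    if (hasEdge(a, b)) links++;
                    if (hasEdge(b, a)) links++;
                } else if (hasEdge(a, b) || hasEdge(b, a)) {
                    links++;
                }
            }
        }
        double possible = directed ? (double) k * (k - 1) : k * (k - 1) / 2.0;
        return links / possible;
    }
}
```

With the edges 0->1, 0->2 and 1->2, local(0, false) gives 1.0 and local(0, true) gives 0.5 under this definition, so a three-node graph like this is enough to see whether two implementations agree in the directed case.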


To Do:
(1) Continue to move the GUI out of StatisticsControllerImpl and
StatisticsReporterImpl
(2) Extend the parametrization & reports of the existing algorithms
(3) Get the 'Print' functionality working correctly in the
StatisticsReporterImpl.
(4) Add a 'Save' option to save the
(5) Expand the reports: add a time stamp, network basics, the network name, etc.

Issues:
(1) There is a bug in the Triangle version of the clustering coefficient
when the network is treated as directed. If you run it as undirected,
brute force and triangles give the same result; if you run it as
directed, they disagree. I will resolve this this week.
(2) The combo box menu on the top-level component uses the actual
Statistics objects. I'm not a huge fan of this, as we then need to
make all of them implement toString() when we already have getName().
Seems like there should be a cleaner solution which uses the existing
getName().
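
For issue (2), one possible approach is a custom Swing cell renderer, so the combo box can keep holding the actual objects while displaying getName(). This is only a sketch: NamedStatistic is a hypothetical stand-in for the real Statistics type (only getName() is assumed), and StatisticLabels and StatisticNameRenderer are names made up for illustration.

```java
import java.awt.Component;
import javax.swing.DefaultListCellRenderer;
import javax.swing.JList;

/** Hypothetical stand-in for the project's Statistics type; only getName() is assumed. */
interface NamedStatistic {
    String getName();
}

/** Pure helper so the label logic can be checked without creating any Swing components. */
final class StatisticLabels {
    private StatisticLabels() {
    }

    static String labelFor(Object value) {
        if (value instanceof NamedStatistic) {
            return ((NamedStatistic) value).getName();
        }
        // Anything else (including null) falls back to the usual string form.
        return String.valueOf(value);
    }
}

/** Renderer that shows getName() instead of toString() for each entry. */
final class StatisticNameRenderer extends DefaultListCellRenderer {
    @Override
    public Component getListCellRendererComponent(JList<?> list, Object value,
            int index, boolean isSelected, boolean cellHasFocus) {
        // Hand the display name to the default renderer, which handles colors and focus.
        return super.getListCellRendererComponent(
                list, StatisticLabels.labelFor(value), index, isSelected, cellHasFocus);
    }
}
```

It would be installed with something like comboBox.setRenderer(new StatisticNameRenderer()), after which none of the statistics would need to override toString().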


Suggestions:
(1) In the data panel, it may not be a bad idea to add a section for
per-network attributes. Many of the statistics return results that
could be placed there for easy reference: average clustering
coefficient, diameter, mean shortest path, degree distribution
(scale-freeness), modularity, etc.
(2) I am seeing a lot of similarity between the Generator and the
Statistics Modules. Perhaps the GeneratorUI can be generalized and
then used for both packages, as the LongTask module has been.

1 comment:

  1. Very good job, keep going that way :-)

    I merged your changes in the main 0.7 branch, don't forget to remerge your branch from the 0.7 trunk.
