
d3.js - Big data visualization using "search, show context, and expand on demand" concept

Question:

I'm trying to visualize a really huge network (3M nodes and 13M edges) stored in a database. For real-time interactivity, I plan to show only a portion of the graph based on user queries and expand it on demand. For instance, when a user clicks a node, I expand its neighborhood. (This is called "Search, Show Context, Expand on Demand" in this paper.)

I have looked into several visualization tools, including Gephi, D3, etc. They take a text file as input, but I have no idea how they can connect to a database and update the graph based on the user's interaction.

The linked paper implemented a system like that, but they didn't describe the tools they were using.

How can I visualize such data with the above criteria?

Answer:

There are several solutions out there, but basically all of them follow the same approach:

  1. create a layer on top of your data source that lets you query it at a high level
  2. create a front-end layer that talks to the layer described above
  3. use the visualization tool of your choice
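
The first two layers above can be sketched roughly as follows. This is only an illustration under stated assumptions: the dictionary `ADJACENCY` stands in for the real database, and the names `query_neighbors` and `expand` are hypothetical, not part of any tool mentioned in this thread. The JSON shape (`nodes` / `links`) is just one plausible format for a front end to consume.

```python
import json
from typing import Dict, List, Set

# Hypothetical in-memory stand-in for the database: node id -> neighbour ids.
ADJACENCY: Dict[str, List[str]] = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a"],
    "d": ["b"],
}


def query_neighbors(node_id: str, limit: int = 50) -> List[str]:
    """Layer 1: a high-level query over the data source (here, a dict)."""
    return ADJACENCY.get(node_id, [])[:limit]


def expand(node_id: str, already_shown: Set[str], limit: int = 50) -> str:
    """Layer 2: what an endpoint facing the front end could return as JSON.

    Only neighbours the client does not display yet are sent, together
    with the links attaching them to the clicked node. Links to neighbours
    that are already on screen are omitted in this simplified sketch.
    """
    new_nodes = [n for n in query_neighbors(node_id, limit) if n not in already_shown]
    payload = {
        "nodes": [{"id": n} for n in new_nodes],
        "links": [{"source": node_id, "target": n} for n in new_nodes],
    }
    return json.dumps(payload)
```

The visualization tool (step 3) would then only have to merge each returned fragment into the graph it already displays, so the full 3M-node network never has to reach the browser.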

As miro marchi pointed out, there are several ways to achieve this goal: some are locked to particular data sources, while others give you much more freedom but require some coding skills.

Datasource

I would start with the choice of the source type: given the kind of data, I would probably choose Neo4J, Titan or OrientDB (if you fancy something more exotic with some flexibility). All of them offer a JSON REST API: the first with its own system and query language (Cypher), the other two through the Blueprints / Rexster stack. Neo4J supports the Blueprints stack as well, if you prefer Gremlin over Cypher.
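
As a rough illustration of what an "expand this node" request could look like with Cypher over a JSON API, here is a sketch that only builds the request body. The function name `build_expand_request` and the parameter names are hypothetical. The `statements` layout follows Neo4j's transactional HTTP endpoint as I understand it, and the `$name` parameter syntax depends on the Neo4j version (older releases used `{name}`), so check both against the documentation for the version you run.

```python
import json
from typing import Any, Dict

# Cypher text for "give me the neighbourhood of one node".
# Assumption: $name parameters; older Neo4j releases used {name} instead.
EXPAND_QUERY = (
    "MATCH (n)-[r]-(m) "
    "WHERE id(n) = $node_id "
    "RETURN n, r, m "
    "LIMIT $limit"
)


def build_expand_request(node_id: int, limit: int = 100) -> Dict[str, Any]:
    """Build a JSON-serializable body holding one parameterized statement."""
    return {
        "statements": [
            {
                "statement": EXPAND_QUERY,
                # Parameters are sent separately from the query text,
                # so user input is never spliced into the Cypher string.
                "parameters": {"node_id": node_id, "limit": limit},
            }
        ]
    }


if __name__ == "__main__":
    print(json.dumps(build_expand_request(42, limit=25), indent=2))
```

The `LIMIT` matters for this use case: a hub node in a 13M-edge graph can have far more neighbours than a browser can draw, so the query layer should cap what it returns.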

For other solutions, such as other NoSQL or SQL databases, you will probably have to code a layer on top that exposes the corresponding REST API. It will work as well, but I wouldn't recommend it for the kind of data you have.

Now only the third point is left, and here you have several choices.

Generic Viz tools

  • Sigma.js is a free, open source and quite nice tool for graph visualization. As far as I know, Linkurious uses a forked version of it in their product.

  • Keylines is a commercial graph visualization tool with advanced styling, analytics and layouts; they provide copy/paste demos if you are using Neo4J or Titan. It is not free, but it supports even older browsers - IE7 onwards...

  • VivaGraph is another free and open source graph visualization tool, but it has a smaller community than Sigma.js.

  • D3.js is the factotum of data visualization: you can build basically any kind of visualization with it, but the learning curve is quite steep.

  • Gephi is another free and open source solution, for the desktop. You will probably need an external plugin, but it supports most of the formats out there - GraphML, CSV, Neo4J, etc...

Vendor specific

  • Linkurious is a commercial, Neo4J-specific, complete tool to search and investigate data.

  • Neo4J web-admin console - even if it's basic, it has improved a lot with the newer 2.x.x versions, which are based on D3.js.

There are also other solutions that I probably forgot to mention, but the ones above should offer a good variety.

Other notes

The JS tools above visualize well up to about 1500-2000 nodes at once, due to JS limits.
If you want to visualize bigger graphs while expanding, I would recommend desktop solutions such as Gephi.
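
One way to stay under that limit while the user keeps expanding is to give the view a fixed node budget and drop the least recently touched nodes once it is exceeded. The sketch below is an assumption about how this could be done, not something any of the tools above does for you; the class name `VisibleNodes` is hypothetical and the default of 1500 simply reuses the lower figure quoted above.

```python
from collections import OrderedDict
from typing import Iterable, List


class VisibleNodes:
    """Tracks which nodes are on screen and evicts the oldest when over budget."""

    def __init__(self, max_nodes: int = 1500) -> None:
        self.max_nodes = max_nodes
        # Insertion order doubles as "least recently touched first".
        self._nodes: "OrderedDict[str, None]" = OrderedDict()

    def add(self, node_ids: Iterable[str]) -> List[str]:
        """Add nodes and return the ids evicted to stay within the budget."""
        for node_id in node_ids:
            # Re-adding a node moves it to the "most recent" end.
            self._nodes.pop(node_id, None)
            self._nodes[node_id] = None
        evicted: List[str] = []
        while len(self._nodes) > self.max_nodes:
            oldest, _ = self._nodes.popitem(last=False)
            evicted.append(oldest)
        return evicted

    def ids(self) -> List[str]:
        """Return the visible node ids, oldest first."""
        return list(self._nodes)
```

The front end would remove the returned ids (and their edges) from the drawing after each expansion, so the number of rendered nodes stays bounded no matter how long the exploration session lasts.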

Disclaimer

I'm part of the Keylines dev team.

Answer:

I'm a newbie too, but I'm interested in this topic, so I'd like to share some information.

  • Gephi also ships as a jar file, the Gephi Toolkit: a Java library you can install on a server (you will need Java there), which lets you read directly from a database. You can then use the sigma.js JavaScript library to display the computed network in the browser.
  • Another option is Tulip + Python, following Alberto Cottica's advice on his blog, plus sigma.js for display. It gives something near real time, as he says.
  • Sigma.js, by Alexis Jacomy, is designed specifically for visualizing large networks, using Canvas or WebGL, so I suggest it for the JS part.
  • If you have a Neo4j graph database (or want one), there is a dedicated web platform, Linkurious (they offer enterprise services at a price), which does the whole job: storing, exploring, collaborating, visualizing, and so on. Gephi and Linkurious have a linked history (Sébastien Heymann is co-creator of the former and CEO of the latter).
  • They also recently released linkurious.js, an enhancement of the sigma.js library (for both visualization and performance) that is open source and extensible (modular). For instance, with the WebGL renderer (beta) you can display up to 20000 nodes with a pre-computed layout (read more).
  • I also suggest this post by Max de Marzi about using visualsearch.js together with a Neo4j graph database (for visual queries in the Cypher language).
  • Oh, and take a look at VivaGraph.js too, again with WebGL or SVG... Max de Marzi has a post on this too.
  • And among the paid services there is also Keylines.
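
Several of the options above share one pattern: compute the layout on the server, then hand the browser a graph with fixed coordinates. Here is a minimal sketch of that hand-off. The circular layout is only a placeholder for whatever Gephi Toolkit or Tulip would compute, the function names are hypothetical, and the field names (`id`, `label`, `x`, `y`, `size`, `source`, `target`) follow the JSON layout I believe sigma.js expects, so verify them against the version you use.

```python
import json
import math
from typing import Dict, List, Tuple


def circular_layout(node_ids: List[str]) -> Dict[str, Tuple[float, float]]:
    """Placeholder layout: spread the nodes evenly on the unit circle."""
    count = max(len(node_ids), 1)
    return {
        node_id: (
            math.cos(2 * math.pi * i / count),
            math.sin(2 * math.pi * i / count),
        )
        for i, node_id in enumerate(node_ids)
    }


def to_graph_json(node_ids: List[str], edges: List[Tuple[str, str]]) -> str:
    """Serialize nodes with pre-computed positions, plus edges, as JSON."""
    positions = circular_layout(node_ids)
    data = {
        "nodes": [
            {"id": n, "label": n, "x": positions[n][0], "y": positions[n][1], "size": 1}
            for n in node_ids
        ],
        "edges": [
            {"id": f"e{i}", "source": s, "target": t}
            for i, (s, t) in enumerate(edges)
        ],
    }
    return json.dumps(data)
```

Because the positions are already in the payload, the browser only has to draw, which is the setting in which the 20000-node figure quoted above for the WebGL renderer applies.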

My case: retebuonvivere.org - for the moment I'm using a Drupal website to collect the data, then the Views module to create the query and the d3 module (which works with the D3.js library) to display it. But I believe this is only a good workflow for a small network (at the moment there are 150 nodes and 177 edges, and it works nicely). I mention this only because it may give you a hint about a new workflow.

I hope all this helps by giving you some hints.
