This page showcases a live use case of the real-time decentralized data analytics of DIAS. The nodes of the DIAS network are mapped to 28 countries generating GDELT news events. Each DIAS node represents a GDELT country: it disseminates into the DIAS network the number of news events generated during the last 15 minutes and, at the same time, aggregates the number of news events generated by the other GDELT country nodes. The goal of the demonstrated data analytics is for each country to compute, in a decentralized fashion, the total number of news events generated by all countries participating in the network.

The live plot shows the total number of news events reported to GDELT (y axis) over time (x axis). It contains three main time series of the total number of GDELT news events:

  1. GDELT Actual: The raw values extracted from GDELT and saved in a database. They serve as the baseline for comparison, i.e. how well DIAS approximates the actual GDELT values.
  2. DIAS Actual: Each DIAS node takes as input 9 representative state values, e.g. low, medium and high profiles, which makes a decentralized aggregation feasible in a distributed network. Each country node periodically extracts the DIAS Actual data from the GDELT Actual data by sliding a window of 27 observations and uniformly sampling 9 values.
  3. DIAS Estimated: The values estimated by the DIAS nodes as a result of the decentralized computations. DIAS is designed to accurately approximate the DIAS Actual data and, as a result, the raw GDELT Actual data as well.
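
The extraction of the DIAS Actual states can be illustrated with a minimal sketch. It assumes that "uniformly sampling" means picking evenly spaced positions within the most recent 27 observations; the page does not say whether the sampling is evenly spaced or random, so this is one possible reading. The function names `extract_states` and `nearest_state` are hypothetical and not part of DIAS.

```python
def extract_states(observations, window=27, num_states=9):
    """Return num_states values sampled at evenly spaced positions
    from the most recent `window` observations.

    Assumption: evenly spaced sampling in time order; DIAS itself
    may sample differently.
    """
    if len(observations) < window:
        raise ValueError("need at least `window` observations")
    recent = observations[-window:]   # the current sliding window
    step = window // num_states       # 27 // 9 == 3
    return [recent[i * step] for i in range(num_states)]


def nearest_state(states, value):
    """Map a raw observation to the closest representative state."""
    return min(states, key=lambda s: abs(s - value))


# Example with synthetic counts 0..99 (not real GDELT data).
counts = list(range(100))
states = extract_states(counts)
selected = nearest_state(states, 90)
```

Each time a new 15-minute observation arrives, the window slides forward by one position and the states are extracted again.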

The live plot updates every 15 minutes, and the user can click to turn on or off the three time series as well as the data from each country node. This concludes the non-technical part; the technical part outlining the solution follows.

The DIAS deployment infrastructure consists of the following components, as shown in the figure below:

  1. GDELT Crawler: A crawler that fetches the news updates from the GDELT API every 15 minutes and sends them to the Live Demonstrator Database and the Mockup Devices.
  2. Live Demonstrator Database: It stores all the data shown in the live plot and serves the live demonstrator.
  3. Extraction Engine: It uses the GDELT Actual data to extract the possible states of the DIAS nodes.
  4. Mockup Devices: These are the clients emulating the GDELT countries and are mapped one to one to DIAS Nodes.
  5. DIAS Nodes: The nodes that perform the decentralized data analytics. They execute the gossip-based peer sampling service and the DIAS aggregation over the possible states of the Mockup Devices.
  • All of the above components are deployed on distributed Hetzner servers.
  • Every DIAS Node runs in a separate JVM. Communication is performed using ZeroMQ.
  • More implementation details about the live demonstrator can be found here.
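
The idea of every node computing the network-wide total without a central server can be illustrated with a minimal sketch. It is not the DIAS aggregation algorithm: it swaps in a plain push-pull gossip, simulated in a single process, in which nodes exchange and merge per-country counts with randomly chosen peers. The function `gossip_totals` and all of its parameters are hypothetical, and the random peer choice only stands in for a peer sampling service.

```python
import random


def gossip_totals(local_counts, max_rounds=1000, seed=0):
    """Simulate push-pull gossip until every node knows every count.

    local_counts maps a node name to its own count. Returns a pair
    (totals, rounds): the total as computed by each node, and the
    number of gossip rounds that were run.
    """
    rng = random.Random(seed)
    nodes = list(local_counts)
    # Each node starts out knowing only its own count.
    views = {n: {n: local_counts[n]} for n in nodes}
    rounds = 0
    while len(nodes) > 1 and rounds < max_rounds:
        if all(len(v) == len(nodes) for v in views.values()):
            break  # every node has a complete view
        rounds += 1
        for node in nodes:
            # Stand-in for peer sampling: pick one random other node.
            peer = rng.choice([p for p in nodes if p != node])
            # Push-pull: both sides end up with the union of their views.
            # Keying by node name prevents counting a value twice.
            merged = {**views[node], **views[peer]}
            views[node] = dict(merged)
            views[peer] = dict(merged)
    totals = {n: sum(v.values()) for n, v in views.items()}
    return totals, rounds


# Example with 28 synthetic nodes (not real GDELT data).
counts = {f"country_{i}": 10 * i for i in range(28)}
totals, rounds = gossip_totals(counts)
```

Once the views are complete, every node reports the same total, equal to the sum of all local counts. In the real deployment each node runs in its own JVM and exchanges messages over ZeroMQ, which this single-process simulation leaves out.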