Authored by Rob. Originally published July 15, 2022.

Decentralandalytics - DAO Owned Analytics

Decentraland DAO-owned data aggregation proposal.

We believe Decentraland player usage data should be collected by the DAO and made available to all Decentraland users.

Player location data is currently available through the /comms/islands endpoint on DCL content server nodes. This information can be collected periodically (e.g. every few seconds), stored in a database, and served through API queries open to the Decentraland community. It is not feasible for multiple parties to collect this information independently, as doing so would put undue strain on Decentraland infrastructure.

This data should be collected and owned by the DAO, and therefore be owned by Decentraland’s users and not a private company or individual.

Why collect this data and make it available?

Having been in the analytics business for 18 months, we at Atlas CORP know of several immediate benefits the Decentraland community could realize from this service.

Grow the number of successful Decentraland Builders

Move the conversation from “why Decentraland?” to “why your use case?” for builders looking to win clients or raise funds.

Providing high-level Decentraland statistics will support the growth and success of builders. Often the first question asked of teams looking to build or sell in Decentraland focuses on the metaverse itself and not the team’s use case. Although Decentraland is the most decentralized metaverse, it still exists in a competitive landscape, and investors and clients often need to justify their investment or choice of metaverse before they can focus on a team’s specific use case.

By providing data on Daily Active Users, Total User Growth of Decentraland, and traffic by parcel, the DAO lets builders refer to these open source analytics instead of each team attempting to obtain them on its own.

Prevent undue load on Decentraland Infrastructure

While the data in question is public, it is not possible for everyone to query it for themselves without an adverse impact on Decentraland nodes.

There has already been discussion in the forums about shutting down external access to player position data due to increased loads felt by Decentraland content server nodes. Too many concurrent requests to these endpoints would have the effect of a DDoS attack which could impact the quality of the service each node can provide.

Therefore instead of a) playing out the Tragedy of the Commons that would occur if everyone collected the data for themselves, b) removing access such that nobody can benefit from this data, or c) allowing the data to be acquired by the highest bidder/private entity – we believe the ecosystem will benefit most from the data collection being done once and everyone sharing in access to that data.

Prevent Private Monopolization of the Data

We have first-hand knowledge of how valuable this data can be to those operating in Decentraland. User position data can be used to determine how many users attended events – critical for event hosts to understand how well their event performed. Daily active user data is crucial to those seeking to invest in the metaverse to help understand returns on investment.

We believe that no private institution should be able to gatekeep this information from the rest of the community. We at Atlas CORP used to make this information freely available using our own private hosting infrastructure, but demand outpaced our capacity and requires a move to more dynamic scaling solutions. That is why we are bringing this proposal forward.

What Data is in Scope?

This proposal is only for Decentraland user data as reported by the /comms/islands endpoint on Decentraland Content Servers.

This proposal does NOT include:

Content Server data (e.g. scene files, user profile history)

Scene-specific data (e.g. object clicks and interactions)

Any personally identifiable information (PII), excluding ETH wallet address

User IRL location data

Any derived content from a user’s ETH wallet

Data will be collected from all active, registered Decentraland Content Servers which at the time of writing includes Hephaestus, hela, heimdallr, baldr, loki, dg, odin, unicorn, marvel, and athena.

This data set can provide a platform for building more sophisticated reports by the DCL developer community. The DAO could also choose to one day monetize access to this data (e.g. when the data is being used directly for profit), as well as augment what data is collected and made available. It is important to note that this proposal is currently limited in scope to the one data set and free access to the community, and that these ideas could be fodder for future proposals.

How will this work?

Data will be collected every 20 seconds and piped into a Mongo cloud database, with a Digital Ocean server set up to provide API access to queries on the data.

An automated feed will be set up to collect data from an authoritative source of each active DAO node. Collecting data every 20 seconds yields 3 data points per minute, so graphs at one-minute granularity will aggregate over 3 data points per minute. Data collection will be set up redundantly on two or more servers, or on a single load-balanced cluster, to minimize any downtime in data collection. This data is expected to grow at 1 GB per day, which may accelerate with the growth of daily active users.
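The collection step described above can be sketched roughly as follows. This is a minimal illustration, not the proposed implementation: the server URL is a placeholder, the function names are hypothetical, and the payload shape (a list of islands, each with a list of peers carrying an address and a parcel) is an assumption about what /comms/islands returns rather than something the proposal specifies.

```python
import json
import time
from datetime import datetime, timezone
from urllib.request import urlopen

# Placeholder list; real base URLs would come from the registered content servers.
SERVERS = ["https://peer.example.org"]
POLL_SECONDS = 20  # from the proposal: one snapshot every 20 seconds


def fetch_islands(base_url):
    """Fetch the raw /comms/islands payload from one content server."""
    with urlopen(f"{base_url}/comms/islands", timeout=10) as resp:
        return json.load(resp)


def flatten_snapshot(server, payload, taken_at):
    """Turn one payload into one record per connected peer.

    Assumed payload shape (not confirmed by the proposal):
    {"islands": [{"id": str, "peers": [{"address": str, "parcel": [x, y]}]}]}
    """
    records = []
    for island in payload.get("islands", []):
        for peer in island.get("peers", []):
            records.append({
                "server": server,
                "island": island.get("id"),
                "address": peer.get("address"),
                "parcel": peer.get("parcel"),
                "taken_at": taken_at,
            })
    return records


def collect_once(servers, fetch=fetch_islands, now=None):
    """Poll every server once; all records share a single timestamp."""
    taken_at = now or datetime.now(timezone.utc)
    records = []
    for server in servers:
        try:
            payload = fetch(server)
        except (OSError, ValueError):
            continue  # unreachable node or invalid JSON; the next poll retries it
        records.extend(flatten_snapshot(server, payload, taken_at))
    return records


def run_forever(store):
    """Call `store` (e.g. a database insert) with fresh records every 20 seconds."""
    while True:
        store(collect_once(SERVERS))
        time.sleep(POLL_SECONDS)
```

Passing `fetch` and `now` as parameters keeps the polling logic testable without network access. Running the same loop on two hosts would give the redundancy described above, at the cost of needing to de-duplicate records by server and timestamp.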

As the data is natively in JSON, we propose to use Mongo Atlas as the cloud database of choice. The database will also be deployed with multiple nodes to minimize downtime. Mongo provides an easy way to scale for future needs – whether through increasing storage capacity or sharding the deployment for increased API and query load.

To minimize infrastructure costs, we propose keeping only 3 months of data available for public consumption. An automated process will be designed to back up, purge, and post historical data so that users can perform longer-range historical analysis without undue cost to the DAO.
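As a rough illustration of the retention rule, the sketch below splits records into those kept online and those due for archiving, and estimates the size of the online window from the growth figure quoted earlier. It assumes "3 months" means 90 days and that each record carries a `taken_at` timestamp; both are assumptions made for this example, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

RETENTION_DAYS = 90     # assumption: "3 months" approximated as 90 days
GROWTH_GB_PER_DAY = 1   # growth estimate quoted in the proposal


def retention_cutoff(now, days=RETENTION_DAYS):
    """Oldest timestamp that stays in the public database."""
    return now - timedelta(days=days)


def split_for_archive(records, now):
    """Return (keep, archive): records inside and outside the retention window."""
    cutoff = retention_cutoff(now)
    keep, archive = [], []
    for record in records:
        (keep if record["taken_at"] >= cutoff else archive).append(record)
    return keep, archive


def online_storage_gb(days=RETENTION_DAYS, gb_per_day=GROWTH_GB_PER_DAY):
    """Approximate size of the online window at a constant growth rate."""
    return days * gb_per_day
```

At the quoted growth rate this puts the online window at roughly 90 GB. A MongoDB TTL index could handle the purge step on its own, but it deletes expired documents outright, so the backup would have to run before documents reach their expiry.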

An API server will be written to provide simplified API access to the data in the database. These queries may include things like users per minute (global or per parcel), daily active users (global or per parcel), and unique Decentraland visitors (global or per parcel). The API code can be made open source and published on GitLab, while the running service is deployed privately on Digital Ocean to prevent unauthorized access to the DAO’s database. Access can be granted to additional DAO representatives if deemed appropriate.
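To make the query types concrete, here is a minimal in-memory sketch of users per minute and daily active users, each optionally filtered to one parcel. It assumes records shaped like `{"address", "parcel", "taken_at"}` with UTC timestamps; that shape and the function names are illustrative assumptions, and a real deployment would more likely express these as database aggregations.

```python
from collections import defaultdict
from datetime import datetime, timezone


def _matches(record, parcel):
    """True when no parcel filter is set or the record sits on that parcel."""
    if parcel is None:
        return True
    found = record.get("parcel")
    return found is not None and tuple(found) == tuple(parcel)


def users_per_minute(records, parcel=None):
    """Unique addresses seen in each minute, across all snapshots in that minute."""
    seen = defaultdict(set)
    for record in records:
        if _matches(record, parcel):
            minute = record["taken_at"].replace(second=0, microsecond=0)
            seen[minute].add(record["address"])
    return {minute: len(addresses) for minute, addresses in seen.items()}


def daily_active_users(records, parcel=None):
    """Unique addresses seen on each calendar day of the stored timestamps."""
    seen = defaultdict(set)
    for record in records:
        if _matches(record, parcel):
            seen[record["taken_at"].date()].add(record["address"])
    return {day: len(addresses) for day, addresses in seen.items()}
```

Counting unique addresses rather than raw records matters here: with three snapshots per minute, a user who stays put would otherwise be counted three times.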

API access will remain open, though a throttling mechanism will be put in place per IP address to prevent DDoS of the API servers. In addition, query data (e.g. who’s asking for what) could be saved in the database and made available as APIs for complete transparency.
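The proposal does not say which throttling mechanism would be used. One common option is a token bucket per IP address, sketched below; the class name, the rate, and the burst size are all illustrative assumptions.

```python
import time


class TokenBucketLimiter:
    """Per-IP token bucket: each request spends one token; tokens refill over time."""

    def __init__(self, rate_per_sec=1.0, burst=10, clock=time.monotonic):
        self.rate = rate_per_sec   # tokens added per second
        self.burst = burst         # maximum tokens an IP can hold
        self.clock = clock         # injectable so the limiter can be tested
        self._buckets = {}         # ip -> (tokens, time of last update)

    def allow(self, ip):
        """Return True if the request from `ip` may proceed."""
        now = self.clock()
        tokens, last = self._buckets.get(ip, (float(self.burst), now))
        # Refill for the elapsed time, capped at the burst size.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if tokens >= 1.0:
            self._buckets[ip] = (tokens - 1.0, now)
            return True
        self._buckets[ip] = (tokens, now)
        return False
```

An API handler would call `allow(ip)` before running a query and return HTTP 429 when it is refused. The same call site is a natural place to log who asked for what, if the DAO chooses to publish query data.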

A single-page dashboard will be made available to show the last 24 hours of Decentraland population data and daily active users for the last quarter.

What will this cost the Decentraland DAO?

This project will cost the DAO $57,500 and will take 3 months to develop.

This breaks down into:

$7,500.00 - Infrastructure and Hosting Costs:

Budget for Mongo Atlas DB for one year

Budget for Digital Ocean API servers for one year

Budget for Digital Ocean Data collection servers for one year

ENS domain for three years

$45,000.00 - Development Costs (estimated 3-month delivery)

Data collection with redundancy and failover to set up the data collection rails

API query development resources to produce queries and prevent overuse

DevOps and infrastructure development resources to automate as much as possible

Technical Writing resources to provide user-facing API documentation

Front-end development resources to create the dashboard

$4,800 - Ongoing Support Costs (3 months post deployment)

Code Updates when breaking changes occur due to external forces

API query user support in the Decentraland Discord (48-hour SLA)

The Development Team

The development of this data platform would be done by the Atlas Corporation team, who have extensive experience working with this data set:

howieDoin – Lead Analytics & Infrastructure innovation

MorrisMustang – Lead DCL & Solidity innovation

josephAaron – Operations and task management

staleDegree – Senior Solidity/UI Development

ryanNFT – Junior Solidity Development

Summary

The DAO can provide the Decentraland community with a free source of user data via API for up to one year at a cost of $57,500. The existence of this data set will help to grow the builder and entrepreneurial community by providing important metrics needed to win clients and funding, and will prevent monopolization by private entities. Atlas CORP is a suitable candidate for this development given its extensive history in Decentraland data collection and derived analytics.