# Data Agent

At Gata, DataAgent serves as a tangible proof-of-concept. It aggregates compute for large-scale AI inference. Nodes currently run fully in parallel, with no inter-node collaboration.
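To illustrate what "fully parallelized without inter-node collaboration" can look like, here is a minimal sketch rather than Gata's actual implementation. The names `split_into_shards`, `node_worker`, and `run_parallel` are hypothetical, threads stand in for separate nodes, and the "inference" step is a placeholder that only measures string length.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(items, num_nodes):
    """Deal items round-robin into one shard per node."""
    return [items[i::num_nodes] for i in range(num_nodes)]


def node_worker(shard):
    """Process one shard in isolation.

    Placeholder for real model inference (an assumption for illustration):
    each item is paired with its length. The worker never reads another
    node's shard or results.
    """
    return [(item, len(item)) for item in shard]


def run_parallel(items, num_nodes=3):
    """Fan shards out to independent workers, then concatenate the results."""
    shards = split_into_shards(items, num_nodes)
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        per_node = list(pool.map(node_worker, shards))
    # The only shared step is this final merge on the coordinator side.
    return [result for node_results in per_node for result in node_results]


results = run_parallel(["a", "bb", "ccc", "dddd", "eeeee"], num_nodes=2)
print(results)
```

Because no worker depends on another worker's output, adding nodes only changes how the input is sharded. That independence is what makes this kind of workload straightforward to spread across many contributors.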

DVA (Data Validation Agent) is Gata's first DataAgent.

It evaluates the quality of image-caption data from across the internet, assigning each pair a score between -1 and 1. DVA scores are used to identify and select the highest-quality data points from the internet pool, which can then be used to pre-train vision-language AI models such as Stable Diffusion, DALL-E, and GPT-4o.
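The selection step described above can be sketched as a simple filter over scored pairs. This is an illustrative example only: the `ScoredPair` type, the `select_high_quality` helper, the `0.3` threshold, and the sample URLs and scores are all assumptions, not part of the DVA platform or its API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredPair:
    image_url: str
    caption: str
    score: float  # quality score, expected to lie in [-1, 1]


def select_high_quality(pairs, threshold=0.3, top_k=None):
    """Keep pairs scoring at or above `threshold`, best first.

    The threshold value is a hypothetical choice for illustration.
    """
    for pair in pairs:
        if not -1.0 <= pair.score <= 1.0:
            raise ValueError(f"score out of range [-1, 1]: {pair.score}")
    kept = sorted(
        (p for p in pairs if p.score >= threshold),
        key=lambda p: p.score,
        reverse=True,
    )
    return kept if top_k is None else kept[:top_k]


# Made-up sample pool; real scores would come from the scoring model.
pool = [
    ScoredPair("https://example.com/a.jpg", "a dog on a beach", 0.82),
    ScoredPair("https://example.com/b.jpg", "click here to buy", -0.41),
    ScoredPair("https://example.com/c.jpg", "a red bicycle", 0.35),
    ScoredPair("https://example.com/d.jpg", "image", 0.05),
]

selected = select_high_quality(pool, threshold=0.3)
print([p.caption for p in selected])  # ['a dog on a beach', 'a red bicycle']
```

Passing `top_k` instead of relying on the threshold alone is useful when a fixed dataset size matters more than an absolute quality bar.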

<figure><img src="/files/4AkfccrzoGVOniUbfdRR" alt=""><figcaption><p>Gata DVA Platform</p></figcaption></figure>

{% hint style="info" %}
Want to try Gata DVA? Head to the [DataAgent](https://app.gata.xyz/dataAgent) to start.
{% endhint %}

### DVA Mechanism

* Traditional data labeling is human-intensive:

Conventionally, training AI models has required humans to spend countless hours manually labeling data, which is slow, expensive, and hard to scale.

* The DataAgent approach:

Instead of relying on human labeling, DataAgent leverages AI to automate the entire data generation pipeline. Users can simply contribute their idle compute through the browser and participate in crowdsourcing high-quality AI data.

* Why AI labeling over human labeling?

AI-generated data is the future. As AI systems approach and surpass human-level intelligence, the most valuable training data will increasingly be produced by AI itself.

* Cost-effective and scalable:

AI-driven data generation is dramatically more efficient and scalable than human labeling, making it a necessary foundation for the next wave of AI development.


