# Data Dictionary: Docker Hub Repositories

This dictionary explains every data field available in the **Docker Hub Repositories** dataset and gives an example of each.

{% hint style="info" %}
All personal/company information mentioned within this context is entirely fictional and is solely intended for illustrative purposes.
{% endhint %}

{% tabs %}
{% tab title="Data fields per category" %}

1. [Metadata](#metadata)
2. [Repository publisher](#repository-publisher)
3. [Repository details](#repository-details)
{% endtab %}
{% endtabs %}

{% hint style="info" %}
The data fields in the example snippets have been rearranged for better grouping. To see where a specific data field stands, check the full data sample [here](/additional-sources/docker-hub/docker-hub-repositories/data-sample.md).
{% endhint %}

***

## Metadata

### Record metadata

| Data field             | Description                                          | Data type                   |
| ---------------------- | ---------------------------------------------------- | --------------------------- |
| `_meta`                | Contains information about the record                | Object                      |
| `created_at_date`      | The date when we first scraped the record            | Array of numbers (integers) |
| `created_at_timestamp` | The date we first scraped the record (Unix time)     | Float                       |
| `updated_at_date`      | The date when we last scraped the record             | Array of numbers (integers) |
| `updated_at_timestamp` | The date when we last scraped the record (Unix time) | Float                       |
| `version_id`           | Dataset version ID                                   | String                      |
| `source`               | The record source                                    | String                      |
| `object`               | The data object/entity                               | String                      |
| `is_deleted`           | Marks if the record has been deleted from Docker Hub | Boolean                     |

**See a snippet of the dataset for reference:**

{% code title="Metadata" %}

```json
"_meta": {
	"created_at_date": [
		2023,
		5,
		26
	],
	"created_at_timestamp": 1685105728.013557,
	"updated_at_date": [
		2024,
		5,
		1
	],
	"updated_at_timestamp": 1714554186.669536,
	"version_id": "a1efb819",
	"source": "dockerhub",
	"object": "repository",
	"is_deleted": false
},
```

{% endcode %}
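
The date arrays and Unix timestamps above describe the same moments in two forms. The following Python sketch shows one way to convert both. It assumes the array order is `[year, month, day]` and that the timestamps are seconds in UTC, which matches the sample values but is not stated elsewhere on this page. The helper names `to_date` and `to_utc_datetime` are illustrative only.

```python
from datetime import date, datetime, timezone

# Values copied from the metadata snippet on this page.
meta = {
    "created_at_date": [2023, 5, 26],
    "created_at_timestamp": 1685105728.013557,
    "updated_at_date": [2024, 5, 1],
    "updated_at_timestamp": 1714554186.669536,
    "is_deleted": False,
}


def to_date(parts):
    """Turn a [year, month, day] array into a datetime.date (assumed order)."""
    year, month, day = parts
    return date(year, month, day)


def to_utc_datetime(unix_seconds):
    """Interpret a float Unix timestamp as seconds since the epoch, in UTC (assumption)."""
    return datetime.fromtimestamp(unix_seconds, tz=timezone.utc)


created_on = to_date(meta["created_at_date"])
created_at = to_utc_datetime(meta["created_at_timestamp"])
updated_on = to_date(meta["updated_at_date"])
updated_at = to_utc_datetime(meta["updated_at_timestamp"])

print(created_on, created_at.isoformat())
print(updated_on, updated_at.isoformat())
```

For the sample record, the calendar date derived from each timestamp agrees with the corresponding date array, which is a quick consistency check when loading the data.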

### Repository metadata

| Data field     | Description                                                         | Data type |
| -------------- | ------------------------------------------------------------------- | --------- |
| `doc`          | Dataset starting point                                              | Object    |
| `source_id`    | Repository identifier on Docker Hub                                 | String    |
| `id`           | Record identifier in our database                                   | String    |
| `last_updated` | Timestamp of the last repository update, in `ISO 8601` format       | String    |

**See a snippet of the dataset for reference:**

{% code title="Metadata" %}

```json
"doc": {
	"id": "dockerhub_repository_example_repository_dev",
	"source_id": "example_repository_dev",
	"last_updated": "2022-06-09T08:32:05.75699Z",
```

{% endcode %}
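
As a minimal sketch of reading `last_updated`, the Python below parses the sample value with the standard library. The sample has five fractional-second digits and a trailing `Z`, so the sketch uses `strptime` rather than `datetime.fromisoformat`, which only accepts this form in newer Python versions. The fallback format without fractional seconds is an assumption about possible values, and `parse_last_updated` is a hypothetical helper name.

```python
from datetime import datetime, timedelta

# Formats tried in order. The second (no fractional seconds) is an assumption;
# only the first form appears in the sample on this page.
_FORMATS = ("%Y-%m-%dT%H:%M:%S.%f%z", "%Y-%m-%dT%H:%M:%S%z")


def parse_last_updated(value):
    """Parse an ISO 8601 string such as '2022-06-09T08:32:05.75699Z'."""
    for fmt in _FORMATS:
        try:
            return datetime.strptime(value, fmt)
        except ValueError:
            continue
    raise ValueError(f"Unrecognized last_updated value: {value!r}")


parsed = parse_last_updated("2022-06-09T08:32:05.75699Z")
print(parsed.isoformat())
```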

## Repository publisher

| Data field  | Description                   | Data type |
| ----------- | ----------------------------- | --------- |
| `publisher` | Publisher's name              | String    |
| `name`      | Repository title              | String    |
| `hub_user`  | User tied to the repository   | String    |
| `namespace` | Associated developer          | String    |

**See a snippet of the dataset for reference:**

{% code title="Publisher" %}

```json
"publisher": "dev",
"name": "example-repository",
"hub_user": "dev",
"namespace": "dev",
```

{% endcode %}

## Repository details

| Data field         | Description                                                                                 | Data type       |
| ------------------ | ------------------------------------------------------------------------------------------- | --------------- |
| `url`              | Repository URL                                                                              | String          |
| `repository_type`  | Repository type                                                                             | String          |
| `is_automated`     | Indicates if the repository automatically updates the Docker image version                  | Boolean         |
| `status`           | Repository status                                                                           | Integer/boolean |
| `description`      | Repository description                                                                      | String          |
| `full_description` | <p>Full repository description</p><p><strong>Note:</strong> contains control characters</p> | String          |

**See a snippet of the dataset for reference:**

{% code title="Repository details" %}

```json
"url": "https://hub.docker.com/r/example_repository/dev",
"repository_type": "image",
"is_automated": false,
"status": 1,
"description": "Example repository description",
"full_description": "Full example repository description",
```

{% endcode %}
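
Two fields in this table may need light cleanup before use: `status` can arrive as an integer or a boolean, and `full_description` contains control characters. The Python sketch below shows one possible approach. Keeping newlines and tabs while dropping other control characters is a design choice of this example, not a rule of the dataset, and both helper names are hypothetical.

```python
import unicodedata


def normalize_status(value):
    """Return `status` as an int whether it was stored as an integer or a boolean."""
    return int(value)


def strip_control_chars(text, keep="\n\t"):
    """Drop Unicode control characters (category Cc) except those listed in `keep`."""
    return "".join(
        ch for ch in text if ch in keep or unicodedata.category(ch) != "Cc"
    )


# Illustrative input only; the control characters here are made up for the example.
raw_description = "Full\x00 example\r\ndescription\x07"
clean_description = strip_control_chars(raw_description)

print(repr(clean_description))
print(normalize_status(True), normalize_status(1))
```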

### Statistics

| Data field           | Description                                 | Data type |
| -------------------- | ------------------------------------------- | --------- |
| `star_count`         | Number of stars the repository has received | Integer   |
| `pull_count`         | Number of repository downloads              | Integer   |
| `collaborator_count` | Number of collaborators for the repository  | Integer   |

**See a snippet of the dataset for reference:**

{% code title="Statistics" %}

```json
"star_count": 0,
"pull_count": 2,
"collaborator_count": 0,
```

{% endcode %}

### Permissions

| Data field    | Description                                                      | Data type |
| ------------- | ---------------------------------------------------------------- | --------- |
| `permissions` | Permissions in the repository                                    | Object    |
| `read`        | Indicates if the `read` permission is enabled in the repository  | Boolean   |
| `write`       | Indicates if the `write` permission is enabled in the repository | Boolean   |
| `admin`       | Indicates if the `admin` permission is enabled in the repository | Boolean   |

**See a snippet of the dataset for reference:**

{% code title="Permissions" %}

```json
"permissions": {
	"read": true,
	"write": false,
	"admin": false
},
```

{% endcode %}
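
The nested `permissions` object can be flattened into a list of enabled permission names. The sketch below is one way to do that in Python. Treating a missing `permissions` object or a missing flag as "not enabled" is an assumption of this example, and `enabled_permissions` is a hypothetical helper name.

```python
def enabled_permissions(record):
    """List the permission names whose flag is true in a repository record."""
    # Assumption: a missing or null `permissions` object means nothing is enabled.
    permissions = record.get("permissions") or {}
    return [
        name
        for name in ("read", "write", "admin")
        if permissions.get(name) is True
    ]


# Values copied from the permissions snippet on this page.
record = {"permissions": {"read": True, "write": False, "admin": False}}
enabled = enabled_permissions(record)
print(enabled)
```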

