Data Dictionary

Overview

This page explains every data field available in the Docker Hub Repositories dataset and gives examples.

All personal/company information mentioned within this context is entirely fictional and is solely intended for illustrative purposes.

Data points per category

The data points in the example snippets have been rearranged for better grouping. To see where a specific data point sits within a record, refer to the full data sample.



Metadata

Record metadata

| Data point | Description | Data type |
| --- | --- | --- |
| meta | Contains information about the record | Object |
| created_at_date | The date when we first scraped the record | Array of numbers (integers) |
| created_at_timestamp | The date when we first scraped the record (Unix time) | Float |
| updated_at_date | The date when we last scraped the record | Array of numbers (integers) |
| updated_at_timestamp | The date when we last scraped the record (Unix time) | Float |
| version_id | Dataset version ID | String |
| source | The record source | String |
| object | The data object/entity | String |
| is_deleted | Marks whether the record is available on Docker Hub | Boolean |

See a snippet of the dataset for reference:

Metadata
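As a rough illustration of how the Unix-time fields can be used, the sketch below converts `created_at_timestamp` and `updated_at_timestamp` into timezone-aware datetimes. It assumes that both fields are nested under `meta` and that the values are expressed in seconds; neither is confirmed by the table above. The function name `scrape_window` and the sample record are hypothetical.

```python
from datetime import datetime, timezone


def scrape_window(record):
    """Return (first_scraped, last_scraped) as timezone-aware UTC datetimes."""
    meta = record["meta"]  # assumption: timestamp fields are nested under `meta`
    # Assumption: Unix time is given in seconds, not milliseconds.
    first = datetime.fromtimestamp(meta["created_at_timestamp"], tz=timezone.utc)
    last = datetime.fromtimestamp(meta["updated_at_timestamp"], tz=timezone.utc)
    return first, last


# Hypothetical record fragment, invented for illustration only.
example_record = {
    "meta": {
        "created_at_timestamp": 1700000000.0,
        "updated_at_timestamp": 1700086400.0,
        "is_deleted": False,
    }
}

first_scraped, last_scraped = scrape_window(example_record)
print(first_scraped.isoformat())            # 2023-11-14T22:13:20+00:00
print((last_scraped - first_scraped).days)  # 1
```

Passing `tz=timezone.utc` avoids results that depend on the local time zone of the machine running the code.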


Repository metadata

| Data point | Description | Data type |
| --- | --- | --- |
| doc | Dataset starting point | Object |
| source_id | Repository identifier on Docker Hub | String |
| id | Record identifier in our database | String |
| last_updated | Timestamp when the repository was last updated in ISO 8601 format | String |

See a snippet of the dataset for reference:

Metadata
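Because `last_updated` is an ISO 8601 string, it can be parsed with the Python standard library. The following is a minimal sketch; the helper name `parse_last_updated` and the sample value are hypothetical, and the exact precision or UTC suffix used in the dataset is an assumption.

```python
from datetime import datetime, timedelta


def parse_last_updated(value):
    """Parse an ISO 8601 string into a datetime."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 onward,
    # so it is rewritten as an explicit UTC offset for older versions.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


# Hypothetical value, invented for illustration only.
sample = "2023-11-14T22:13:20.123456Z"
parsed = parse_last_updated(sample)
print(parsed.year)                          # 2023
print(parsed.utcoffset() == timedelta(0))   # True
```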


Repository publisher

| Data point | Description | Data type |
| --- | --- | --- |
| publisher | Publisher's name | String |
| name | Repository title | String |
| hub_user | User tied to the repository | String |
| namespace | Associated developer | String |

See a snippet of the dataset for reference:

Publisher


Repository details

| Data point | Description | Data type |
| --- | --- | --- |
| url | Repository URL | String |
| repository_type | Repository type | String |
| is_automated | Indicates if the repository automatically updates the Docker image version | Boolean |
| status | Repository status | Integer/boolean |
| description | Repository description | String |
| full_description | Full repository description. Note: contains control characters | String |

See a snippet of the dataset for reference:

Repository details
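Since `full_description` is noted to contain control characters, it may be worth cleaning the value before displaying or indexing it. The sketch below is one possible approach, not a prescribed one: it removes every character in the Unicode "Cc" (control) category except newlines and tabs. The helper name `strip_control_chars` and the sample text are hypothetical.

```python
import unicodedata


def strip_control_chars(text, keep="\n\t"):
    """Drop control characters (Unicode category Cc) except those listed in `keep`."""
    return "".join(
        ch for ch in text
        if ch in keep or unicodedata.category(ch) != "Cc"
    )


# Hypothetical description text, invented for illustration only.
raw = "Usage:\x00 docker pull\r\nexample\x07"
cleaned = strip_control_chars(raw)
print(repr(cleaned))  # 'Usage: docker pull\nexample'
```

Carriage returns are removed here as well; add `"\r"` to `keep` if the original line endings need to be preserved.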


Statistics

| Data point | Description | Data type |
| --- | --- | --- |
| star_count | Number of stars the repository has received | Integer |
| pull_count | Number of repository downloads | Integer |
| collaborator_count | Number of collaborators for the repository | Integer |

See a snippet of the dataset for reference:

Statistics


Permissions

| Data point | Description | Data type |
| --- | --- | --- |
| permissions | Permissions in the repository | Object |
| read | Indicates if the read permission is enabled in the repository | Boolean |
| write | Indicates if the write permission is enabled in the repository | Boolean |
| admin | Indicates if the admin permission is enabled in the repository | Boolean |

See a snippet of the dataset for reference:

Permissions
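To show how the three Boolean flags might be consumed together, the sketch below reduces a `permissions` object to a single label. It assumes that `read`, `write`, and `admin` are keys of the `permissions` object and that admin outranks write, which outranks read; that ordering is an assumption for illustration and is not stated in the table. The function name `highest_permission` and the sample values are hypothetical.

```python
def highest_permission(permissions):
    """Return the strongest enabled permission, or None if none is enabled."""
    # Assumed ordering, strongest first: admin > write > read.
    for level in ("admin", "write", "read"):
        if permissions.get(level):
            return level
    return None


# Hypothetical permissions object, invented for illustration only.
example_permissions = {"read": True, "write": False, "admin": False}
print(highest_permission(example_permissions))  # read
```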