GitHub Data Dictionary

Contains explanations and examples for all the data fields available in the GitHub dataset.

🎯 Data index

Meta fields

Meta fields are included at the beginning of the dataset. They contain information on the dataset version and on when the record was created and last updated.

Developer information

Contains information about the developer: name, contact information, affiliation, follower and following counts, and the number of authored gists and repos.

Developer's repositories

Contains information on the developer's repositories, such as programming languages used in the repo, repo owner, and license.

Starred repositories

Contains information on the repositories the developer has starred, such as programming languages used in the repo, repo owner, and license.

Developer's subscriptions

Contains information on the repositories the developer has subscribed to, such as programming languages used in the repo, repo owner, and license.

Organizations

Contains information on the organization the developer is affiliated with.

Followers

Contains information on the developer's followers and who the developer is following.

Data points in the example snippets are rearranged for better grouping. To see where a specific data point stands, check the full data sample below:

Full sample
{
	"_meta": {
		"source": "github",
		"object": "user",
		"created_at_date": [
			2021,
			9,
			20
		],
		"created_at_timestamp": 1632112810.959514,
		"updated_at_date": [
			2022,
			11,
			22
		],
		"updated_at_timestamp": 1669141390.862272,
		"version_id": "e0f2c272"
	},
	"doc": {
		"source_id": 69642661,
		"id": "github_people_69642661",
		"image": "https://avatars.githubusercontent.com/u/40758443?v=4",
		"bio": null,
		"contact_info": {
			"blog": "",
			"twitter": null
		},
		"company": null,
		"events_url": "https://api.github.com/users/john-doe/events{/privacy}",
		"follower_count": 14,
		"following_count": 28,
		"hireable": null,
		"url": "https://github.com/john-doe",
		"location": null,
		"username": "john doe",
		"name": "John Doe",
		"node_id": "MDQ6VXNlcjY5NjQyNjYx",
		"public_gist_count": 0,
		"public_repo_count": 9,
		"starred_repos_count": 70,
		"site_admin": false,
		"type": "User",
		"repo": [
			{
				"disabled": false,
				"archived": false,
				"created_at": "2020-12-13T10:59:42Z",
				"default_branch": "main",
				"description": "A  progresive web app (PWA) which utilizes whitespaces to make text invisible",
				"fork": true,
				"fork_count": 0,
				"forked_from": "https://www.github.com/john-doe-software",
				"has_downloads": true,
				"has_issues": false,
				"has_pages": false,
				"has_projects": true,
				"has_wiki": true,
				"website": "https://crm-software.io/",
				"url": "https://github.com/john-doe/software",
				"source_id": 321043607,
				"language": null,
				"languages_distribution": {
					"JavaScript": 88.1,
					"HTML": 11.9
				},
				"license": {
					"key": "mit",
					"name": "MIT License",
					"spdx_id": "MIT",
					"url": "https://api.github.com/licenses/mit",
					"node_id": "MDc6TGljZW5zZTEz"
				},
				"repo_name": "Software",
				"repo_owner": "developer8",
				"name": "Software",
				"node_id": "MDEwOlJlcG9zaXRvcnkzMjEwNDM2MDc=",
				"owner": {
					"image": "https://avatars.githubusercontent.com/u/95486833?v=4",
					"url": "https://github.com/john-doe",
					"source_id": 95486833,
					"username": "john doe",
					"node_id": "MDQ6VXNlcjY5NjQyNjYx",
					"site_admin": false,
					"type": "User"
				},
				"open_issues_count": 0,
				"pushed_at": "2020-12-09T17:34:22Z",
				"size": 1070,
				"stargazer_count": 0,
				"updated_at": "2020-12-13T10:59:42Z",
				"watcher_count": 0,
				"topics": []
			},
			{
				"disabled": false,
				"archived": false,
				"created_at": "2020-09-01T14:41:55Z",
				"default_branch": "master",
				"description": "A simple game android app using Java",
				"fork": false,
				"fork_count": 0,
				"forked_from": null,
				"has_downloads": true,
				"has_issues": true,
				"has_pages": false,
				"has_projects": true,
				"has_wiki": true,
				"website": null,
				"url": "https://github.com/john-doe/game",
				"source_id": 292023912,
				"language": "Java",
				"languages_distribution": {
					"Java": 100.0
				},
				"license": {},
				"repo_name": "Game",
				"repo_owner": "john doe",
				"name": "Game",
				"node_id": "MDEwOlJlcG9zaXRvcnkyOTIwMjM5MTI=",
				"owner": {
					"image": "https://avatars.githubusercontent.com/u/95486833?v=4",
					"url": "https://github.com/john-doe",
					"source_id": 95486833,
					"username": "john doe",
					"node_id": "MDQ6VXNlcjY5NjQyNjYx",
					"site_admin": false,
					"type": "User"
				},
				"open_issues_count": 0,
				"pushed_at": "2020-09-01T18:05:15Z",
				"size": 137,
				"stargazer_count": 1,
				"updated_at": "2020-09-01T14:41:55Z",
				"watcher_count": 1,
				"topics": []
			},
			{
				"disabled": false,
				"archived": false,
				"created_at": "2020-12-13T14:20:00Z",
				"default_branch": "master",
				"description": "A simple android app to calculate tip for a service.",
				"fork": false,
				"fork_count": 0,
				"forked_from": null,
				"has_downloads": true,
				"has_issues": true,
				"has_pages": false,
				"has_projects": true,
				"has_wiki": true,
				"website": null,
				"url": "https://github.com/john-doe/Tips",
				"source_id": 321082187,
				"language": "Kotlin",
				"languages_distribution": {
					"Kotlin": 100.0
				},
				"license": {},
				"repo_name": "Tips",
				"repo_owner": "john doe",
				"name": "Tips",
				"node_id": "MDEwOlJlcG9zaXRvcnkzMjEwODIxODc=",
				"owner": {
					"image": "https://avatars.githubusercontent.com/u/95486833?v=4",
					"url": "https://github.com/john-doe",
					"source_id": 95486833,
					"username": "john doe",
					"node_id": "MDQ6VXNlcjY5NjQyNjYx",
					"site_admin": false,
					"type": "User"
				},
				"open_issues_count": 0,
				"pushed_at": "2020-12-13T14:43:22Z",
				"size": 146,
				"stargazer_count": 0,
				"updated_at": "2020-12-13T14:20:00Z",
				"watcher_count": 0,
				"topics": []
			}
		],
		"starred": [
			{
				"disabled": false,
				"archived": false,
				"created_at": "2022-06-29T10:06:04Z",
				"default_branch": "main",
				"description": "Lab experiment scheme",
				"fork": false,
				"fork_count": 0,
				"forked_from": null,
				"has_downloads": true,
				"has_issues": true,
				"has_pages": false,
				"has_projects": true,
				"has_wiki": true,
				"website": "",
				"url": "https://github.com/starred_developer/Networking-Lab",
				"source_id": 508638480,
				"language": "C",
				"languages_distribution": {
					"C": 100.0
				},
				"license": {},
				"repo_name": "Networking-Lab",
				"repo_owner": "starred_developer",
				"name": "Networking-Lab",
				"node_id": "R_kgDOHlE1EA",
				"owner": {
					"image": "https://avatars.githubusercontent.com/u/95486833?v=4",
					"url": "https://github.com/starred_developer",
					"source_id": 95486833,
					"username": "starred_developer",
					"node_id": "MDQ6VXNlcjY0MjM4OTgz",
					"site_admin": false,
					"type": "User"
				},
				"open_issues_count": 0,
				"pushed_at": "2022-07-28T06:23:19Z",
				"size": 67,
				"stargazer_count": 1,
				"updated_at": "2022-06-29T10:06:04Z",
				"watcher_count": 1,
				"topics": []
			},
			{
				"disabled": false,
				"archived": false,
				"created_at": "2020-06-19T20:12:58Z",
				"default_branch": "source",
				"description": "Algorithms on graphs displayed in a pretty manner",
				"fork": false,
				"fork_count": 0,
				"forked_from": null,
				"has_downloads": true,
				"has_issues": true,
				"has_pages": true,
				"has_projects": true,
				"has_wiki": true,
				"website": "https://starred_developer.github.io/PrettyAlgorithms/",
				"url": "https://github.com/starred_developer/PrettyAlgorithms",
				"source_id": 273578437,
				"language": "TypeScript",
				"languages_distribution": {
					"TypeScript": 80.7,
					"SCSS": 12.7,
					"HTML": 5.4,
					"JavaScript": 1.3
				},
				"license": {
					"key": "mit",
					"name": "MIT License",
					"spdx_id": "MIT",
					"url": "https://api.github.com/licenses/mit",
					"node_id": "MDc6TGljZW5zZTEz"
				},
				"repo_name": "pretty-algorithms",
				"repo_owner": "starred_developer",
				"name": "pretty-graph-algorithms",
				"node_id": "MDEwOlJlcG9zaXRvcnkyNzM1Nzg0Mzc=",
				"owner": {
					"image": "https://avatars.githubusercontent.com/u/81438374?v=4",
					"url": "https://github.com/starred_developer",
					"source_id": 81438374,
					"username": "starred_developer",
					"node_id": "MDQ6VXNlcjQxODc4ODQy",
					"site_admin": false,
					"type": "User"
				},
				"open_issues_count": 5,
				"pushed_at": "2022-11-13T20:45:04Z",
				"size": 490,
				"stargazer_count": 6,
				"updated_at": "2020-06-19T20:12:58Z",
				"watcher_count": 6,
				"topics": [
					"algorithms",
					"computer-science",
					"cs",
					"css",
					"graph-algorithms",
					"graphs",
					"html",
					"javascript",
					"learning",
					"simple",
					"visual",
					"visual-representation",
					"web"
				]
			}
		],
		"subscription": [
			{
				"disabled": false,
				"archived": false,
				"created_at": "2020-09-01T14:41:55Z",
				"default_branch": "master",
				"description": "A simple game android app using Java",
				"fork": false,
				"fork_count": 0,
				"forked_from": null,
				"has_downloads": true,
				"has_issues": true,
				"has_pages": false,
				"has_projects": true,
				"has_wiki": true,
				"website": null,
				"url": "https://github.com/john-doe/game",
				"source_id": 292023912,
				"language": "Java",
				"languages_distribution": {
					"Java": 100.0
				},
				"license": {},
				"repo_name": "game",
				"repo_owner": "john doe",
				"name": "game",
				"node_id": "MDEwOlJlcG9zaXRvcnkyOTIwMjM5MTI=",
				"owner": {
					"image": "https://avatars.githubusercontent.com/u/95486833?v=4",
					"url": "https://github.com/john-doe",
					"source_id": 95486833,
					"username": "john doe",
					"node_id": "MDQ6VXNlcjY5NjQyNjYx",
					"site_admin": false,
					"type": "User"
				},
				"open_issues_count": 0,
				"pushed_at": "2020-09-01T18:05:15Z",
				"size": 137,
				"stargazer_count": 1,
				"updated_at": "2020-09-01T14:41:55Z",
				"watcher_count": 1,
				"topics": []
			},
			{
				"disabled": false,
				"archived": false,
				"created_at": "2020-12-13T14:20:00Z",
				"default_branch": "master",
				"description": "A simple android app to calculate tip for a service.",
				"fork": false,
				"fork_count": 0,
				"forked_from": null,
				"has_downloads": true,
				"has_issues": true,
				"has_pages": false,
				"has_projects": true,
				"has_wiki": true,
				"website": null,
				"url": "https://github.com/john-doe/Tips",
				"source_id": 321082187,
				"language": "Kotlin",
				"languages_distribution": {
					"Kotlin": 100.0
				},
				"license": {},
				"repo_name": "Tips",
				"repo_owner": "john doe",
				"name": "Tips",
				"node_id": "MDEwOlJlcG9zaXRvcnkzMjEwODIxODc=",
				"owner": {
					"image": "https://avatars.githubusercontent.com/u/95486833?v=4",
					"url": "https://github.com/john-doe",
					"source_id": 95486833,
					"username": "john doe",
					"node_id": "MDQ6VXNlcjY5NjQyNjYx",
					"site_admin": false,
					"type": "User"
				},
				"open_issues_count": 0,
				"pushed_at": "2020-12-13T14:43:22Z",
				"size": 146,
				"stargazer_count": 0,
				"updated_at": "2020-12-13T14:20:00Z",
				"watcher_count": 0,
				"topics": []
			}
		],
		"organization": [
			{
				"description": null,
				"source_id": 70442962,
				"username": "developer-org",
				"node_id": "MDEyOk9yZ2FuaXphdGlvbjcwNDQyOTYy",
				"url": "https://api.github.com/orgs/developer-org"
			}
		],
		"followed_by": [
			{
				"username": "Brett Boe",
				"source_id": 7036736,
				"url": "https://github.com/brett-boe"
			},
			{
				"username": "Java Developer",
				"source_id": 36053609,
				"url": "https://github.com/java-developer"
			},
			{
				"username": "Python Developer",
				"source_id": 45203520,
				"url": "https://github.com/python-developer"
			}
		],
		"is_following": [
			{
				"username": "following-dev",
				"source_id": 956605,
				"url": "https://github.com/following-dev"
			},
			{
				"username": "following-dev1234",
				"source_id": 193838,
				"url": "https://github.com/following-dev"
			}
		]
	}
}


Data fields and explanations

Meta fields

Data point

Description

Data type

Example values

_meta

Contains information about the record





source

The record source

string

github

object

The data object/entity

string

user

created_at_date

The date when we first scraped the record

array of numbers

2021, 9, 13

created_at_timestamp

The date we first scraped the record (Unix time)

number

1631498987.348509

updated_at_date

The date when we last scraped the record

array of numbers

2022, 12, 2

version_id

Dataset version ID

string

e0f2c272

updated_at_timestamp

The date when we last scraped the record (Unix time)

number

1669994154.952415

See a snippet of the dataset for reference:

Meta fields
"_meta": {
			"source": "github",
			"object": "user",
			"created_at_date": [
				2021,
				9,
				13
			],
			"created_at_timestamp": 1631498987.348509,
			"updated_at_date": [
				2022,
				12,
				2
			],
			"updated_at_timestamp": 1669994154.952415,
			"version_id": "e0f2c272"
		},
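
The date arrays and Unix timestamps describe the same scrape events, so they can be cross-checked in code. A minimal Python sketch, assuming the `*_date` arrays are ordered `[year, month, day]` and the `*_timestamp` values are seconds since the Unix epoch in UTC. The helper names `meta_date` and `meta_timestamp` are illustrative and not part of the dataset:

```python
from datetime import date, datetime, timezone

# Values copied from the "_meta" snippet above.
meta = {
    "created_at_date": [2021, 9, 13],
    "created_at_timestamp": 1631498987.348509,
    "updated_at_date": [2022, 12, 2],
    "updated_at_timestamp": 1669994154.952415,
}


def meta_date(parts):
    """Convert a [year, month, day] array into a date (assumed ordering)."""
    year, month, day = parts
    return date(year, month, day)


def meta_timestamp(seconds):
    """Convert Unix seconds into a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


created = meta_date(meta["created_at_date"])
updated = meta_timestamp(meta["updated_at_timestamp"])
print(created.isoformat(), updated.isoformat())
```

Under these assumptions the calendar date derived from each timestamp matches the corresponding date array.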


A null value means that the information was not available on GitHub.

Developer information

Metadata

Data point

Description

Data type

Example values

doc

Start of the dataset: contains the first set of information points about the developer

object



source_id

Unique identifier of the record on GitHub

string

5b24276186ed43b1aaad5624bac02cd9

id

Unique identifier of GitHub record in our database

string

github_people_7081362

site_admin

Marks if the user is the site admin

boolean

false

type

Marks the entity type (repository owner)

string

User

See snippets of the dataset for reference:

Dev info - metadata
"site_admin": false,
"type": "User",


Data point

Description

Data type

Example values

events_url

URL template for the developer's events endpoint in the GitHub REST API

string

https://api.github.com/users/ngav/events{/privacy}

node_id

ID assigned to objects by GitHub REST API

string

MDQ6VXNlcjEwMDI5MDY5

See snippets of the dataset for reference:

events_url
"events_url": "https://api.github.com/users/ngav/events{/privacy}",
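
The `events_url` value is a URL template: the trailing `{/privacy}` marks an optional path segment that has to be filled in or dropped before the URL can be requested. A minimal sketch of that step, assuming plain string substitution is enough for this single placeholder. The function name `expand_events_url` and the segment value `public` are illustrative assumptions:

```python
def expand_events_url(template, privacy=None):
    """Fill or drop the optional "{/privacy}" path segment of an events_url."""
    # An empty replacement drops the segment; otherwise it becomes "/<privacy>".
    segment = f"/{privacy}" if privacy else ""
    return template.replace("{/privacy}", segment)


template = "https://api.github.com/users/ngav/events{/privacy}"
print(expand_events_url(template))
print(expand_events_url(template, "public"))
```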


Profile information

Data point

Description

Data type

Example values

image

Developer's avatar/logo

string

https://avatars.githubusercontent.com/u/. . .

bio

Developer's bio

Note: contains control characters

string

I'm just a random dude. Don't mind me.\r\n\r\nDeveloper at imec PreDiCT

url

Developer's GitHub profile

string

https://github.com/john-doe

location

Developer's location

string

indonesia

See snippets of the dataset for reference:

location
"location": "indonesia"
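
Because `bio` may contain control characters such as `\r\n`, it can help to normalize the text before displaying or indexing it. A minimal sketch, assuming that collapsing every run of whitespace into a single space is acceptable for your use case. The helper name `clean_text` is illustrative:

```python
def clean_text(value):
    """Collapse line breaks, tabs, and repeated spaces into single spaces."""
    if value is None:
        # Null means the information was not available on GitHub.
        return None
    return " ".join(value.split())


bio = "I'm just a random dude. Don't mind me.\r\n\r\nDeveloper at imec PreDiCT"
print(clean_text(bio))
```

The same helper could be applied to repository `description` values, which carry the same note.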


Name and username

Data point

Description

Data type

Example values

username

Developer's username

string

john-doe

name

Developer's name. Note: not necessarily the same as the username.

string

john-doe

See a snippet of the dataset for reference:

Name and username
"username": "john-doe",
"name": "john-doe",


Contact information

Data point

Description

Data type

Example values

contact_info

Contains the developer's publicly accessible contact information

object



blog

Developer's blog

string

https://john-doe.be

twitter

Developer's Twitter handle

string

john-doe

See a snippet of the dataset for reference:

Contact information
"contact_info": {
        "blog": "https://john-doe.be",
        "twitter": "john-doe"
      },
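
In the full sample, `blog` is an empty string while `twitter` is null, so both forms can stand for a missing value. A minimal sketch that treats them the same way. The helper name `contact_value` is illustrative, and mapping empty strings to `None` is an assumption about how you want to handle them:

```python
def contact_value(contact_info, key):
    """Return a contact field, mapping null and empty strings to None."""
    value = (contact_info or {}).get(key)
    return value or None


empty = {"blog": "", "twitter": None}
filled = {"blog": "https://john-doe.be", "twitter": "john-doe"}

print(contact_value(empty, "blog"), contact_value(filled, "blog"))
```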


Affiliation

Data point

Description

Data type

Example values

company

Company the user has listed on their profile

string

SJTU

hireable

Marks if the developer is hireable. Note: users select this option in their settings. The information can be retrieved using the GitHub REST API.

boolean

null / true

See snippets of the dataset for reference:

Company
"company": "SJTU",


Following

Data point

Description

Data type

Example values

follower_count

Developer's follower count

number

14

following_count

The number of people the developer follows

number

28

See a snippet of the dataset for reference:

Following
"follower_count": 14,
"following_count": 28,


Gists and repos

Data point

Description

Data type

Example values

public_gist_count

The number of public gists by the developer

number

0

public_repo_count

The number of public repositories owned by the developer

number

2

See a snippet of the dataset for reference:

Gists and repos
"public_gist_count": 0,
"public_repo_count": 2,


Developer's repositories

Data point

Description

Data type

Example values

repo

Contains information on the developer's repositories

array of objects



disabled

Marks if the repository was disabled when we last scraped it

boolean

false

archived

Shows if the repository is archived (read-only)

boolean

false

created_at

Time and date when the repository was created

string

2022-10-14T10:06:59Z

default_branch

Title of the repository default branch

string

main

description

Repository description

Note: may contain control characters

string

null

fork

Marks if the repository in a record is a copy of another repository

boolean

false

fork_count

The number of repository copies

number

0

forked_from

The original repository the copy has been made from

string

null

has_downloads

Marks if the repository has downloads enabled

boolean

true

has_issues

Marks if the repository has the issues section enabled

boolean

true

has_pages

Marks if the repository has the pages section enabled

boolean

false

has_projects

Marks if the repository has the projects section enabled

boolean

true

has_wiki

Shows if the repository has a wiki included

boolean

true

website

Project website

string

null

url

Repository GitHub page

string

https://github.com/john-doe/software

source_id

Unique identifier of the record on GitHub

number

551391535

See a snippet of the dataset for reference:

Developer's repositories
"repo": [ 
      {
	   "disabled": false,
	   "archived": false,
	   "created_at": "2022-10-14T10:06:59Z",
	   "default_branch": "main",
	   "description": null,
	   "fork": false,
	   "fork_count": 0,
	   "forked_from": null,
	   "has_downloads": true,
	   "has_issues": true,
	   "has_pages": false,
	   "has_projects": true,
	   "has_wiki": true,
	   "website": null,
	   "url": "https://github.com/john-doe/software",
	   "source_id": 551391535,


Data point

Description

Data type

Example values

open_issues_count

The number of open issues in the repository

number

47

pushed_at

Time and date of the most recent push to the repository

string

2022-11-01T18:21:42Z

size

Repository size in KB

number

15938

stargazer_count

The number of people who have starred the repository

number

7249

updated_at

Time and date the repository was last updated

string

2021-05-18T03:32:01Z

watcher_count

The number of people who are following the repository updates

number

7249

topics

Topics covered in the repository

array of strings

v2-ui

See a snippet of the dataset for reference:

Developer's repositories
"open_issues_count": 47,
"pushed_at": "2022-11-01T18:21:42Z",
"size": 15938,
"stargazer_count": 7249,
"updated_at": "2021-05-18T03:32:01Z",
"watcher_count": 7249,
"topics":[
	      "v2-ui",
	      "x-ui",
	      "xray",
	      "xray-core",
	      "xray-panel"
			]
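
The `created_at`, `pushed_at`, and `updated_at` values are ISO 8601 strings ending in `Z` (UTC). A minimal sketch of parsing them into timezone-aware datetimes so they can be compared. The helper name `parse_github_time` is illustrative, and the format string assumes second precision, as in the examples above:

```python
from datetime import datetime, timezone


def parse_github_time(value):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' string into an aware UTC datetime."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


# Values copied from the snippet above.
repo = {"pushed_at": "2022-11-01T18:21:42Z", "updated_at": "2021-05-18T03:32:01Z"}

pushed = parse_github_time(repo["pushed_at"])
updated = parse_github_time(repo["updated_at"])
print((pushed - updated).days, "days between updated_at and pushed_at")
```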


Programming languages

Data point

Description

Data type

Example values

language

The main programming language in the repository

string

JavaScript

languages_distribution

Languages and their distribution in the repository by percentage

object

JavaScript: 58.2 Vue: 37.9

See a snippet of the dataset for reference:

Programming languages
"language": "JavaScript",
"languages_distribution": {
						   "JavaScript": 58.2,
						   "Vue": 37.9,
						   "SCSS": 3.0,
				 		   "HTML": 0.9
				         },
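
In the full sample, one repository has `language` set to null even though `languages_distribution` is populated. A minimal sketch of a fallback that picks the language with the largest share in that case. The helper name `primary_language` is illustrative, and the fallback rule is an assumption, not the dataset's own logic:

```python
def primary_language(repo):
    """Return `language`, or the largest share in `languages_distribution`."""
    if repo.get("language"):
        return repo["language"]
    distribution = repo.get("languages_distribution") or {}
    if not distribution:
        return None
    # Pick the key with the highest percentage.
    return max(distribution, key=distribution.get)


repo = {"language": None, "languages_distribution": {"JavaScript": 88.1, "HTML": 11.9}}
print(primary_language(repo))
```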


Repository owner

Data point

Description

Data type

Example values

repo_name

Repository title

string

software

repo_owner

Repository owner's username

string

dev

name

Name of the data entity in the record (repository)

string

software

node_id

ID assigned to objects by GitHub REST API

string

R_kgDOIN2RLw

See a snippet of the dataset for reference:

Repository owner
"repo_name": "software",
"repo_owner": "dev",
"name": "software",
"node_id": "R_kgDOIN2RLw",


License

Data point

Description

Data type

Example values

license

Contains the information on the open-source licenses the repository uses

object



key

License key: the part of the GitHub URL that identifies the license

string

mit

name

License name

string

MIT License

spdx_id

SPDX license ID

string

MIT

url

URL of the GitHub REST API page with information on the license

string

https://api.github.com/licenses/mit

node_id

ID assigned to objects by GitHub REST API

string

MDc6TGljZW5zZTEz

See a snippet of the dataset for reference:

License
"license": {
					"key": "mit",
					"name": "MIT License",
					"spdx_id": "MIT",
					"url": "https://api.github.com/licenses/mit",
					"node_id": "MDc6TGljZW5zZTEz"
					 },
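
In the full sample, repositories without a detected license carry an empty `license` object (`{}`) rather than null. A minimal sketch that reads the SPDX ID safely in both cases. The helper name `license_spdx_id` is illustrative:

```python
def license_spdx_id(repo):
    """Return the SPDX ID of a repository's license, or None if absent."""
    license_info = repo.get("license") or {}
    return license_info.get("spdx_id")


licensed = {"license": {"key": "mit", "name": "MIT License", "spdx_id": "MIT"}}
unlicensed = {"license": {}}

print(license_spdx_id(licensed), license_spdx_id(unlicensed))
```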


Developer information

Data point

Description

Data type

Example values

owner

Contains information on the repository developer

object



image

Developer's logo/avatar

string

https://avatars.githubusercontent.com/u/. . .

url

Developer's profile

string

https://github.com/john-doe

source_id

Unique identifier of the record on GitHub

number

47310637

username

Developer's username

string

dev

node_id

ID assigned to objects by GitHub REST API

string

MDQ6VXNlcjQ3MzEwNjM3

site_admin

Marks if the user is the site admin

boolean

false

type

Marks the entity type (repository owner)

string

User

See a snippet of the dataset for reference:

Developer information
"owner": {
		"image": "https://avatars.githubusercontent.com/u/35107824?v=4",
		"url": "https://github.com/john-doe",
		"source_id": 47310637,
		"username": "dev",
		"node_id": "MDQ6VXNlcjQ3MzEwNjM3",
		"site_admin": false,
		"type": "User"
		},


Starred repositories

Data point

Description

Data type

Example values

starred

Contains information on the repositories the developer starred

array of objects



disabled

Marks if the repository was disabled when we last scraped it

boolean

false

archived

Shows if the repository is archived (read-only)

boolean

false

created_at

Time and date when the repository was created

string

2022-10-14T10:06:59Z

default_branch

Title of the repository default branch

string

master

description

Repository description

Note: may contain control characters

string

null

fork

Marks if the repository in a record is a copy of another repository

boolean

false

fork_count

The number of repository copies

number

362

forked_from

The original repository the copy has been made from

string

null

has_downloads

Marks if the repository has downloads enabled

boolean

true

has_issues

Marks if the repository has the issues section enabled

boolean

true

has_pages

Marks if the repository has the pages section enabled

boolean

false

has_projects

Marks if the repository has the projects section enabled

boolean

true

has_wiki

Shows if the repository has a wiki included

boolean

true

website

Project website

string

null

url

Repository GitHub page

string

https://github.com/john-doe/software

source_id

Unique identifier of the record on GitHub

number

551391535

See a snippet of the dataset for reference:

Starred repositories
"starred": [
	{
	 "disabled": false,
	 "archived": false,
	 "created_at": "2021-01-19T18:26:34Z",
	 "default_branch": "master",
	 "description": null,
	 "fork": false,
	 "fork_count": 362,
	 "forked_from": null,
	 "has_downloads": true,
     "has_issues": true,
	 "has_pages": false,
	 "has_projects": true,
	 "has_wiki": false,
	 "website": null,
	 "url": "https://github.com/dev/software",
	 "source_id": 331071860,


Data point

Description

Data type

Example values

open_issues_count

The number of open issues in the repository

number

47

pushed_at

Time and date of the most recent push to the repository

string

2022-02-09T22:20:12Z

size

Repository size in KB

number

292

stargazer_count

The number of people who have starred the repository

number

321

updated_at

Time and date the repository was last updated

string

2021-01-19T18:26:34Z

watcher_count

The number of people who are following the repository updates

number

3217

topics

Topics covered in the repository

array of strings

android

See a snippet of the dataset for reference:

Topics
"topics": [
	"android",
	"android-apps",
	"android-samples",
	"chat",
	"chat-app",
	"getstream",
	"kotlin",
	"real-time-chat"
	]


Programming languages

Data point

Description

Data type

Example values

language

The main programming language in the repository

string

Python

languages_distribution

Languages and their distribution in the repository by percentage

object

Python: 95.3

See a snippet of the dataset for reference:

Programming languages
"language": "Python",
"languages_distribution": {
						"Python": 95.3,
						"Starlark": 4.7
	                      },


Repository owner

Data point

Description

Data type

Example values

repo_name

Repository title

string

dev-software

repo_owner

Repository owner's username

string

dev

name

Name of the data entity in the record (repository)

string

dev-software

node_id

ID assigned to objects by GitHub REST API

string

MDEwOlJlcG9zaXRvcnkzNDIzNDM4NTE=

See a snippet of the dataset for reference:

Repository owner
"repo_name": "dev-software",
"repo_owner": "dev",
"name": "dev-software",
"node_id": "MDEwOlJlcG9zaXRvcnkzNDIzNDM4NTE="
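
Older-style `node_id` values such as the one above are base64 strings that decode to a type name followed by a numeric ID. Newer values such as `R_kgDOIN2RLw` do not follow this pattern. A minimal sketch of decoding the older form, based only on the values seen in this document and not on any documented guarantee. The helper name `decode_legacy_node_id` and the regular expression are assumptions:

```python
import base64
import re

# Assumed shape of a decoded legacy ID, e.g. "010:Repository342343851".
LEGACY_NODE_ID = re.compile(r"^(\d+):([A-Za-z]+)(\d+)$")


def decode_legacy_node_id(node_id):
    """Return (type_name, numeric_id) for legacy node IDs, else None."""
    # Restore any stripped "=" padding before decoding.
    padded = node_id + "=" * (-len(node_id) % 4)
    try:
        decoded = base64.b64decode(padded, validate=True).decode("ascii")
    except ValueError:
        # Not plain base64 or not ASCII: treat as a newer-format ID.
        return None
    match = LEGACY_NODE_ID.match(decoded)
    if match is None:
        return None
    return match.group(2), int(match.group(3))


print(decode_legacy_node_id("MDEwOlJlcG9zaXRvcnkzNDIzNDM4NTE="))
print(decode_legacy_node_id("R_kgDOIN2RLw"))
```

Treat the decoded number as a convenience check only, and prefer `source_id` as the identifier.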


License

Data point

Description

Data type

Example values

license

Contains the information on the open-source licenses the repository uses

object



key

License key: the part of the GitHub URL that identifies the license

string

gpl-3.0

name

License name

string

GNU General Public License v3.0

spdx_id

SPDX license ID

string

GPL-3.0

url

URL of the GitHub REST API page with information on the license

string

https://api.github.com/licenses/gpl-3.0

node_id

ID assigned to objects by GitHub REST API

string

MDc6TGljZW5zZTk=

See a snippet of the dataset for reference:

License
"license": {
			 "key": "gpl-3.0",
			 "name": "GNU General Public License v3.0",
			 "spdx_id": "GPL-3.0",
			 "url": "https://api.github.com/licenses/gpl-3.0",
			  "node_id": "MDc6TGljZW5zZTk="
		     },


Developer information

Data point

Description

Data type

Example values

owner

Contains information on the developer of the starred repository

object



image

Developer's logo/avatar

string

https://avatars.githubusercontent.com/u/. . .

url

Developer's profile

string

https://github.com/john-noakes

source_id

Unique identifier of the record on GitHub

number

8597527

username

Developer's username

string

john noakes

node_id

ID assigned to objects by GitHub REST API

string

MDEyOk9yZ2FuaXphdGlvbjg1OTc1Mjc=

site_admin

Marks if the user is the site admin

boolean

false

type

Shows the entity type (repository owner)

string

Organization

See a snippet of the dataset for reference:

Developer information
"owner": {
		"image": "https://avatars.githubusercontent.com/u/8597527?v=4",
		"url": "https://github.com/john-noakes",
		"source_id": 8597527,
		"username": "john noakes",
		"node_id": "MDEyOk9yZ2FuaXphdGlvbjg1OTc1Mjc=",
		"site_admin": false,
		"type": "Organization"
	 },


Developer's subscriptions

Data point

Description

Data type

Example values

subscription

Contains information on the repositories the developer subscribes to

array of objects



disabled

Marks if the repository was disabled when we last scraped it

boolean

false

archived

Shows if the repository is archived (read-only)

boolean

false

created_at

Time and date when the repository was created

string

2021-02-25T18:40:15Z

default_branch

Title of the repository default branch

string

master

description

Repository description. Note: may contain control characters.

string

null

fork

Marks if the repository in a record is a copy of another repository

boolean

false

fork_count

The number of repository copies

number

0

forked_from

The original repository the copy has been made from

string

null

has_downloads

Marks if the repository has downloads enabled

boolean

true

has_issues

Marks if the repository has the issues section enabled

boolean

true

has_pages

Marks if the repository has the pages section enabled

boolean

true

has_projects

Marks if the repository has the projects section enabled

boolean

true

has_wiki

Shows if the repository has a wiki included

boolean

true

website

Project website

string

null

url

Repository GitHub page

string

https://github.com/john-stiles/software

source_id

Unique identifier of the record on GitHub

number

342343851

See a snippet of the dataset for reference:

Developer's subscriptions
"subscription": [
			{
			"disabled": false,
			"archived": false,
			"created_at": "2021-02-25T18:40:15Z",
			"default_branch": "master",
			"description": null,
			"fork": false,
			"fork_count": 0,
			"forked_from": null,
			"has_downloads": true,
			"has_issues": true,
			"has_pages": true,
			"has_projects": true,
			"has_wiki": true,
			"website": null,
			"url": "https://github.com/john-stiles/software",
			"source_id": 342343851,


Data point

Description

Data type

Example values

open_issues_count

The number of open issues in the repository

number

0

pushed_at

Time and date of the most recent push to the repository

string

2021-03-17T14:55:44Z

size

Repository size in KB

number

388554

stargazer_count

The number of people who have starred the repository

number

0

updated_at

Time and date the repository was last updated

string

2021-03-17T14:55:02Z

watcher_count

The number of people who are following the repository updates

number

0

topics

Topics covered in the repository

array of strings

public-api

See a snippet of the dataset for reference:

topics
"topics": [
		"api",
		"apis",
		"dataset",
		"development",
		"free",
		"list",
		"lists",
		"open-source",
		"public",
		"public-api",
		"public-apis",
		"resources",
		"software"
	]


Programming languages

Data point

Description

Data type

Example values

language

The main programming language in the repository

string

JavaScript

languages_distribution

Contains languages and their distribution in the repository by percentage

object

JavaScript: 99.9 HTML: 0.1 CSS: 0.0

See a snippet of the dataset for reference:

language
"language": "JavaScript",
"languages_distribution": {
						"JavaScript": 99.9,
						"HTML": 0.1,
						"CSS": 0.0
					      },


Repository owner

Data point

Description

Data type

Example values

repo_name

Repository title

string

python.dev.repo

repo_owner

Repository owner's username

string

Python dev

name

Name of the data entity in the record (repository)

string

python.dev.repo

node_id

ID assigned to objects by GitHub REST API

string

MDEwOlJlcG9zaXRvcnkzNDg3NDg2MzA=

See a snippet of the dataset for reference:

Repository owner
"repo_name": "python.dev.repo",
"repo_owner": "Python dev",
"name": "python.dev.repo",
"node_id": "MDEwOlJlcG9zaXRvcnkzNDg3NDg2MzA=",


License

| Data point | Description | Data type | Example values |
| --- | --- | --- | --- |
| license | Contains information on the open-source license the repository uses | object | |
| key | Part of the GitHub URL identifying the license | string | ms-pl |
| name | License name | string | Microsoft Public License |
| spdx_id | SPDX license identifier | string | MS-PL |
| url | GitHub REST API URL with information on the license | string | https://api.github.com/licenses/ms-pl |
| node_id | ID assigned to objects by GitHub REST API | string | MDc6TGljZW5zZTE5 |

See a snippet of the dataset for reference:

license
"license": {
            "key": "ms-pl",
            "name": "Microsoft Public License",
            "spdx_id": "MS-PL",
            "url": "https://api.github.com/licenses/ms-pl",
            "node_id": "MDc6TGljZW5zZTE5"
          },
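When reading the `license` object, a short label such as the SPDX identifier is often all that is needed. The following Python sketch assumes that `license` may be missing or `null` for some repositories (this section does not state that, so treat it as an assumption); `license_label` is a hypothetical helper name.

```python
def license_label(repo):
    """Return a short license label for a repository record, or None.

    Prefers the SPDX identifier and falls back to the human-readable name.
    Assumes `license` can be absent or null, which the dictionary does not confirm.
    """
    license_info = repo.get("license")
    if not license_info:
        return None
    return license_info.get("spdx_id") or license_info.get("name")


# Values copied from the snippet above.
repo = {
    "license": {
        "key": "ms-pl",
        "name": "Microsoft Public License",
        "spdx_id": "MS-PL",
        "url": "https://api.github.com/licenses/ms-pl",
        "node_id": "MDc6TGljZW5zZTE5",
    }
}

label = license_label(repo)
missing = license_label({"license": None})
```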


Developer information

| Data point | Description | Data type | Example values |
| --- | --- | --- | --- |
| owner | Contains information on the developer of the subscribed repository | object | |
| image | Developer's logo/avatar | string | https://avatars.githubusercontent.com/u/. . . |
| url | Developer's profile | string | https://github.com/java-developer |
| source_id | Unique identifier of the record on GitHub | number | 79530557 |
| username | Developer's username | string | java developer |
| node_id | ID assigned to objects by GitHub REST API | string | MDQ6VXNlcjc5NTMwNTU3 |
| site_admin | Indicates whether the developer is a GitHub site administrator | boolean | false |
| type | Marks the entity type (repository owner) | string | User |

See a snippet of the dataset for reference:

Owner
"owner": {
		"image": "https://avatars.githubusercontent.com/u/2933505?v=4",
		"url": "https://github.com/java-developer",
		"source_id": 79530557,
		"username": "java developer",
		"node_id": "MDQ6VXNlcjc5NTMwNTU3",
		"site_admin": false,
		"type": "User"
		},
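The example `node_id` values in this section look like Base64 strings: decoding the owner's value gives `04:User79530557`, which combines the entity type with the `source_id`. The Python sketch below decodes IDs of that shape. It is an illustration, not a documented contract: `decode_legacy_node_id` is a hypothetical helper, and IDs that do not follow this pattern (GitHub also issues node IDs in other formats) simply return `None`.

```python
import base64
import re


def decode_legacy_node_id(node_id):
    """Decode a Base64 node_id into (entity_type, numeric_id), or None.

    Only handles values that decode to '<digits>:<Type><digits>'.
    """
    try:
        decoded = base64.b64decode(node_id, validate=True).decode("utf-8")
    except ValueError:
        # Covers invalid Base64 input and non-UTF-8 payloads.
        return None
    match = re.fullmatch(r"\d+:([A-Za-z]+)(\d+)", decoded)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


# Values copied from the example snippets in this dictionary.
owner = {"source_id": 79530557, "node_id": "MDQ6VXNlcjc5NTMwNTU3"}
organization = {"source_id": 70442962, "node_id": "MDEyOk9yZ2FuaXphdGlvbjcwNDQyOTYy"}

owner_decoded = decode_legacy_node_id(owner["node_id"])
org_decoded = decode_legacy_node_id(organization["node_id"])
```

Comparing the decoded numeric part with `source_id` is a cheap sanity check that a record's identifiers belong together.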


Organizations

| Data point | Description | Data type | Example values |
| --- | --- | --- | --- |
| organization | Contains information on the organizations the developer is connected to | array of objects | |
| description | Organization description. Note: may contain control characters | string | null |
| source_id | Unique identifier of the record on GitHub | number | 70442962 |
| username | Organization name | string | dev-org |
| node_id | ID assigned to objects by GitHub REST API | string | MDEyOk9yZ2FuaXphdGlvbjcwNDQyOTYy |
| url | GitHub REST API URL that returns information on the organization | string | https://api.github.com/orgs/dev-org |

See a snippet of the dataset for reference:

Organization
"organization": [
	{
		"description": null,
		"source_id": 70442962,
		"username": "dev-org",
		"node_id": "MDEyOk9yZ2FuaXphdGlvbjcwNDQyOTYy",
		"url": "https://api.github.com/orgs/dev-org"
	}
],
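Since an organization's `description` can be `null` and may contain control characters, it is worth normalizing before display or storage. A minimal Python sketch follows; `clean_description` is a hypothetical helper, and replacing control characters with spaces is one possible policy, not something the dataset prescribes.

```python
import unicodedata


def clean_description(text):
    """Replace control characters with spaces and collapse repeated whitespace."""
    if text is None:
        return None
    # Unicode category 'Cc' covers control characters such as \x00, \r, \n, \t.
    replaced = "".join(
        " " if unicodedata.category(ch) == "Cc" else ch for ch in text
    )
    return " ".join(replaced.split())


# Hypothetical description containing control characters (not real data).
raw = "Build\x00 tools\r\n\tfor devs"
cleaned = clean_description(raw)

# The example organization in this section has a null description.
cleaned_null = clean_description(None)
```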


Followers

Followers

| Data point | Description | Data type | Example values |
| --- | --- | --- | --- |
| followed_by | Contains information on people who are following the developer | array of objects | |
| username | Follower's username | string | follower-dev |
| source_id | Unique identifier of the record on GitHub | number | 6514464 |
| url | Follower's GitHub profile | string | https://github.com/follower-dev |

See a snippet of the dataset for reference:

Followers
"followed_by": [
        {
          "username": "follower-dev",
          "source_id": 6514464,
          "url": "https://github.com/follower-dev"
        }
    ]


Following

| Data point | Description | Data type | Example values |
| --- | --- | --- | --- |
| is_following | Contains information on the people the developer follows | array of objects | |
| username | Followee's username | string | following-dev |
| source_id | Unique identifier of the record on GitHub | number | 163421 |
| url | Followee's GitHub profile | string | https://github.com/following-dev |

See a snippet of the dataset for reference:

Following
"is_following": [
        {
          "username": "following-dev",
          "source_id": 163421,
          "url": "https://github.com/following-dev"
        }
  ]
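The `followed_by` and `is_following` arrays share the same object shape, so they can be compared through `source_id`. The Python sketch below finds accounts that appear in both lists. The helper name `mutual_follows` and the third account `mutual-dev` are hypothetical additions for illustration.

```python
def mutual_follows(record):
    """Return usernames that both follow the developer and are followed back."""
    followers = {
        person["source_id"]: person["username"]
        for person in record.get("followed_by") or []
    }
    following_ids = {
        person["source_id"] for person in record.get("is_following") or []
    }
    return sorted(
        name for source_id, name in followers.items() if source_id in following_ids
    )


# Sample record based on the snippets above; 'mutual-dev' (source_id 1) is made up.
record = {
    "followed_by": [
        {"username": "follower-dev", "source_id": 6514464,
         "url": "https://github.com/follower-dev"},
        {"username": "mutual-dev", "source_id": 1,
         "url": "https://github.com/mutual-dev"},
    ],
    "is_following": [
        {"username": "following-dev", "source_id": 163421,
         "url": "https://github.com/following-dev"},
        {"username": "mutual-dev", "source_id": 1,
         "url": "https://github.com/mutual-dev"},
    ],
}

mutual = mutual_follows(record)
```

Matching on `source_id` rather than `username` is a deliberate choice here, since the numeric ID is described as the unique identifier of the record on GitHub.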



