Dictionary: Clean Employee Data
Overview
Clean Employee Data provides high-quality, structured workforce data that is ready for immediate use. Our data is meticulously cleaned and enriched, enabling businesses to streamline operations, enhance decision-making, and optimize workforce analysis.
By leveraging Clean Employee Data, organizations can reduce engineering overhead, gain access to additional insights, and work with optimized data formats for improved efficiency. The data is available in JSONL, Parquet, and CSV formats, ensuring faster downloads and seamless integration.
With flexible retrieval options, including flat file downloads and API access, businesses in sales tech, HR intelligence, and investment sectors can efficiently access the workforce insights they need.
Clean Employee Data is derived from our Base Employee Data.
The data fields are separated into collections to visualize the data better.
All personal/company information mentioned within this context is entirely fictional and is solely intended for illustrative purposes.
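As an illustration of working with the JSONL format mentioned above, the following minimal sketch reads records line by line and keeps only publicly available profiles. The sample records are fictional, the helper name `read_records` is hypothetical, and an in-memory stream stands in for a downloaded file.

```python
import io
import json

# Fictional sample data; field names follow this dictionary, values are made up.
SAMPLE_JSONL = (
    '{"member_id": 1, "member_full_name": "Jane Doe", "member_is_deleted": 0, '
    '"member_experience": [{"company_id": 10, "title": "Engineer", "date_to": null}]}\n'
    '{"member_id": 2, "member_full_name": "John Roe", "member_is_deleted": 1, '
    '"member_experience": []}\n'
)


def read_records(stream):
    """Yield one dict per non-empty JSONL line (hypothetical helper)."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


records = list(read_records(io.StringIO(SAMPLE_JSONL)))

# member_is_deleted == 0 marks profiles that were publicly available.
public_ids = [r["member_id"] for r in records if r["member_is_deleted"] == 0]
```

With a real flat file, `io.StringIO(SAMPLE_JSONL)` would be replaced by an opened file handle.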
Metadata
member_last_updated
Cleaned
Date the record was last updated
String
member_is_deleted
Raw
Indicates whether the profile was accessible:
1 – deleted or private
0 – publicly available
Integer
Cleaning actions
member_last_updated
Value is converted to the yyyy-mm-dd format.
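The conversion to the yyyy-mm-dd format could look roughly like the sketch below. The list of input formats is an assumption, since the formats present in the raw data are not documented here, and `to_iso_date` is a hypothetical helper name.

```python
from datetime import datetime

# Assumed input formats; the real raw formats are not listed in this dictionary.
ASSUMED_INPUT_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%d", "%d/%m/%Y")


def to_iso_date(value):
    """Return the value as a yyyy-mm-dd string, or None if it cannot be parsed."""
    if value is None:
        return None
    text = str(value).strip()
    for fmt in ASSUMED_INPUT_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime("%Y-%m-%d")
        except ValueError:
            continue  # try the next assumed format
    return None
```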
Identifiers
member_id
Raw
Identification key in our database
Integer
member_websites_professional_network
Raw
Professional network profile URL
String
member_picture_url
Raw
Profile picture URL
String
member_full_name
Cleaned
Full name
String
member_name_first
Raw
First name
String
member_name_middle
Enriched
Middle name
String
member_name_last
Enriched
Last name
String
member_shorthand_names
Raw
A list of all historical employee shorthand names
Array of strings
member_follower_count
Raw
Number of profile followers
Integer
member_public_profile_id
Raw
Publicly provided employee URN
String
Cleaning actions
member_full_name
Special characters/emojis are removed;
Any words that follow a comma or are in parentheses are removed;
Titles (preceding or following the name) are removed.
member_name_middle
Parsed from member_full_name.
member_name_last
Parsed from member_full_name.
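The name cleaning rules above can be approximated with a few regular expressions. This is only a sketch under stated assumptions: the title list `ASSUMED_TITLES` is hypothetical, and splitting on whitespace is one simple way to parse the middle and last names, not necessarily the method used in production.

```python
import re

# Hypothetical title list; the real list used during cleaning is not published here.
ASSUMED_TITLES = {"dr", "mr", "mrs", "ms", "prof", "phd", "mba", "md"}


def clean_full_name(raw):
    """Apply the three documented cleaning rules to a raw full name."""
    text = re.sub(r"\([^)]*\)", " ", raw)    # drop words in parentheses
    text = text.split(",", 1)[0]             # drop everything after a comma
    text = re.sub(r"[^\w\s'.-]", " ", text)  # drop emojis and special characters
    words = [w for w in text.split()
             if w.strip(".").lower() not in ASSUMED_TITLES]
    return " ".join(words)


def parse_middle_last(full_name):
    """Return (member_name_middle, member_name_last) from a cleaned full name."""
    parts = full_name.split()
    last = parts[-1] if len(parts) > 1 else None
    middle = " ".join(parts[1:-1]) or None
    return middle, last
```

For example, a raw value such as `"Dr. Jane Marie Doe, PhD (she/her)"` would be reduced to `"Jane Marie Doe"`, from which `"Marie"` and `"Doe"` are parsed.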
Skills
member_skills
Enriched
List of the employee's skills
Array of strings
Enriching action
member_skills
Enriched with our ML model from different description fields.
Experience
member_description
Raw
Job position description
String
company_id
Enriched
Identification key for the company associated with the employee's experience
Integer
member_job_title
Cleaned
Current job position title
String
is_decision_maker
Enriched
Indicates whether the employee is a decision-maker based on member_job_title
1 – Employee is marked as a decision-maker in the current role
0 – Employee is not marked as a decision-maker in the current role
Integer
member_job_description
Raw
Current job position description
String
member_headline
Raw
Job title found in the profile headline
String
member_generated_headline
Raw
A user-written headline that appears in web search results, "also viewed" sections, and other publicly available spaces.
It serves the same purpose as the title but is derived from a different source, potentially providing more accurate and up-to-date profile information.
Use this field in place of the title, as it reflects the latest user activity.
String
total_experience_duration
Enriched
Summed up experience (displayed as years and months)
String
total_experience_duration_months
Enriched
Summed up employee experience (displayed as months)
Integer
Cleaning and enriching actions
company_id
Company ID from an active experience record from member_experience.
member_job_title
Special characters are removed.
total_experience_duration
Values converted to readable text.
total_experience_duration_months
Field aggregated from duration values.
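A rough sketch of how the two total experience fields relate: months are summed across experience records and then rendered as text. The exact wording of the readable text (for example "3 years 3 months") is an assumption, as are the helper names.

```python
def total_months(experience):
    """Sum duration_months across experience records, skipping missing values."""
    return sum(item.get("duration_months") or 0 for item in experience)


def readable_duration(months):
    """Format a month count as years and months (wording is an assumption)."""
    years, rest = divmod(months, 12)
    parts = []
    if years:
        parts.append(f"{years} year{'s' if years != 1 else ''}")
    if rest or not parts:
        parts.append(f"{rest} month{'s' if rest != 1 else ''}")
    return " ".join(parts)
```

Note that a plain sum counts overlapping roles twice; whether the production aggregation deduplicates overlaps is not stated in this dictionary.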
The member_experience table is mapped with our historical data because the professional network hides work experience on certain employees' profiles.
member_experience
-
Employee's work experience
Array of objects
company_id
Raw
Workplace (company) identifier in our database
Integer
date_from
Cleaned
Employment start date
String (date)
date_from_year
Cleaned
Employment start year
Integer
date_from_month
Cleaned
Employment start month
Integer
date_to
Cleaned
Employment end date
String (date)
date_to_year
Cleaned
Employment end year
Integer
date_to_month
Cleaned
Employment end month
Integer
company_url
Raw
Employee's workplace URL on professional network
String
company_name
Raw
Employer company
String
title
Raw
Job title
String
department
Enriched
Department the employee works in
String
management_level
Enriched
Employee's management level
String
description
Cleaned
Job description
String
order_in_profile
Raw
Record order as seen on the employee's profile
Integer
duration
Enriched
Employment duration
String (date)
duration_months
Cleaned
Employment duration in months
Integer
location
Cleaned
Job/workplace location
String
company_logo_url
–
URL pointing to the logo of the company/employer
String
Cleaning and enriching actions
date_from
Value is converted to the yyyy-mm-dd format.
date_from_year,
date_from_month
Year and month values are extracted from the date_from value;
Values are converted to integers.
date_to
Value is converted to the yyyy-mm-dd format.
date_to_year,
date_to_month
Year and month values are extracted from the date_to value;
Values are converted to integers.
department
Enriched with our ML model from the title value.
management_level
Enriched with our ML model from the member_job_title value.
description
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None;
Value is set to None if the description is shorter than 3 characters;
Text styling tags are removed;
Multiple spaces are replaced with single ones.
duration
Derived from date_from and date_to values.
duration_months
Duration is converted to a numerical value.
location
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
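The text cleaning rules listed for description recur throughout this dictionary, so a small sketch may help. It assumes that "text styling tags" are HTML-like tags and applies the length check after tags and extra spaces are removed; both are assumptions, and the function names are hypothetical.

```python
import re

# Placeholder strings that the cleaning step replaces with None.
PLACEHOLDERS = {"None", "Unknown", "NaN", "nan", "na", "null", "Null", "NULL", "-", "--"}


def normalize_placeholder(value):
    """Return None for placeholder strings, otherwise the stripped text."""
    if value is None:
        return None
    text = value.strip()
    # Treating an empty string as missing is an assumption.
    return None if text in PLACEHOLDERS or text == "" else text


def clean_description(value):
    """Apply the documented description rules in one pass."""
    text = normalize_placeholder(value)
    if text is None:
        return None
    text = re.sub(r"<[^>]+>", " ", text)      # remove styling tags (assumed HTML-like)
    text = re.sub(r"\s+", " ", text).strip()  # collapse multiple spaces
    return text if len(text) >= 3 else None   # too short to be meaningful
```

The same `normalize_placeholder` step would cover fields such as location, where only the placeholder rule applies.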
member_department
Enriched
Departments derived from the member_job_title
String
member_management_level
Enriched
Management levels identified from the member_job_title
String
is_working
Enriched
Represents if the employee is currently working
0 – the employee is currently not working
1 – the employee is currently working
Integer
Enriching actions
member_department
Enriched with our ML model from the member_job_title value.
member_subdepartment
Enriched with our ML model from the member_job_title value.
member_management_level
Enriched with our ML model from the member_job_title value.
is_working
Based on date_to and date_from values of employee experience.
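One plausible reading of the is_working rule is sketched below: the employee counts as currently working if any experience record has a start date and no end date. The exact rule is not spelled out in this dictionary, so this logic is an assumption.

```python
def is_working(experience):
    """Return 1 if any experience record is open-ended, else 0 (assumed rule)."""
    for item in experience:
        # A record with date_from but no date_to is treated as a current role.
        if item.get("date_from") and not item.get("date_to"):
            return 1
    return 0
```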
Education
member_education
Employee's education
Array of objects
major
Cleaned
Field of study
String
title
Cleaned
Educational institution
String
date_to
Cleaned
Graduation date
String
date_from
Cleaned
Enrolment date
String
institution_url
Cleaned
Institution's profile URL
String
institution_logo_url
–
URL pointing to the logo of the educational institution (university, school, training provider)
String
description
Cleaned
Education description
String
activities_and_societies
Cleaned
Details about activities and societies
String
Cleaning actions
title
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None;
Values are capitalized.
major
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
date_from
Value is converted to the yyyy format.
date_to
Value is converted to the yyyy format.
institution_url
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
description
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None;
Text styling tags are removed;
Multiple spaces are replaced with single ones.
activities_and_societies
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None;
Text styling tags are removed;
Multiple spaces are replaced with single ones.
Hidden collections
is_hidden
Marks if the employee profile has a hidden education/experience collection.
0 – education/experience information was available at the time of profile scraping.
1 – education/experience information was not available at the time of profile scraping
Number (integer)
Location
member_location_raw_address
Cleaned
Raw address of the employee's location
String
member_location_country
Cleaned
Country of the employee's location
String
member_location_regions
Cleaned
Geographical regions within the employee's country
String
member_location_city
Cleaned
Employee location city
String
member_location_state
Cleaned
Employee location state
String
member_location_country_iso_2
–
ISO 2-letter code of the location country
String
member_location_country_iso_3
–
ISO 3-letter code of the location country
String
Cleaning actions
member_location_raw_address
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None;
Trailing special characters are trimmed;
Value is set to None if it is shorter than three characters;
The value of member_location_country is added at the end of the string.
member_location_country
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
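The raw address rules can be sketched as follows. The set of trailing characters that gets trimmed and the check that avoids appending a country already present at the end are assumptions made for illustration; `clean_raw_address` is a hypothetical name.

```python
# Placeholder strings that the cleaning step replaces with None.
PLACEHOLDERS = {"None", "Unknown", "NaN", "nan", "na", "null", "Null", "NULL", "-", "--"}


def clean_raw_address(raw, country=None):
    """Clean a raw address and append the country (sketch of the listed rules)."""
    if raw is None or raw.strip() in PLACEHOLDERS:
        return None
    # Assumed set of trailing special characters to trim.
    text = raw.strip().rstrip(" ,;.-/|")
    if len(text) < 3:
        return None
    # Assumption: skip appending when the address already ends with the country.
    if country and not text.lower().endswith(country.lower()):
        text = f"{text}, {country}"
    return text
```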
Recommendations and connections
member_recommendations
Cleaned
Employee recommendations
Array of objects
recommendation
Cleaned
Recommendation text
String
referee_name
Raw
Referee's name
String
referee_url
Raw
Referee's profile URL
String
member_recommendations_count
Cleaned
Number of received recommendations
Integer
member_connections_count
Raw
Number of employee's connections
Integer
Cleaning actions
member_recommendations
Deleted rows are filtered out.
recommendation
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None;
Value is set to None if it is shorter than three characters;
Text styling tags are removed;
Multiple spaces are replaced with single ones;
Empty recommendations are filtered out.
member_recommendations_count
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None;
None values are replaced with 0 and converted to an integer.
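As a minimal sketch of the count and filtering rules above: placeholders become 0, other values are cast to integers, and recommendation objects with no text are dropped. Falling back to 0 for values that cannot be parsed is an assumption, and the function names are hypothetical.

```python
# Placeholder strings that the cleaning step replaces with None.
PLACEHOLDERS = {"None", "Unknown", "NaN", "nan", "na", "null", "Null", "NULL", "-", "--"}


def clean_recommendations_count(value):
    """Return the count as an integer, using 0 for missing or placeholder values."""
    if value is None or str(value).strip() in PLACEHOLDERS:
        return 0
    try:
        return int(str(value).strip())
    except ValueError:
        return 0  # assumption: unparseable values are treated as missing


def filter_recommendations(items):
    """Keep only recommendation objects whose text survived cleaning."""
    return [item for item in items if item.get("recommendation")]
```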
Languages
member_languages
Employee's language knowledge
Array of objects
language
Cleaned
Language
String
proficiency
Cleaned
Language proficiency
String
order_in_profile
Raw
Record order in the section
Integer
Cleaning actions
language
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
proficiency
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
Certifications
member_certifications
Employee's certifications
Array of objects
title
Cleaned
Certificate title
String
issuer
Cleaned
Certificate issuer
String
credential_id
Cleaned
Credential ID
String
certificate_url
Cleaned
Certificate URL
String
certificate_logo_url
–
URL pointing to the logo of the certification provider (AWS, Microsoft, Coursera, etc.)
String
date_from
Cleaned
Issue date
String
date_to
Cleaned
Expiration date
String
issuer_url
Cleaned
Issuer profile URL
String
order_in_profile
Raw
Section record order
Integer
date_from_year
Cleaned
Issue year
Integer
date_from_month
Cleaned
Issue month
Integer
date_to_year
Cleaned
Expiration year
Integer
date_to_month
Cleaned
Expiration month
Integer
Cleaning actions
title
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
issuer
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
date_from
Value is converted to the yyyy-mm-dd format.
date_to
Value is converted to the yyyy-mm-dd format.
issuer_url
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
date_from_year,
date_to_year
Year value from date is converted to an integer.
date_from_month,
date_to_month
Month value from date is converted to an integer.
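Extracting the integer year and month fields from a cleaned date can be sketched with the standard library, assuming the date has already been converted to the yyyy-mm-dd format. `date_parts` is a hypothetical helper name.

```python
from datetime import date


def date_parts(value):
    """Return (year, month) as integers from a yyyy-mm-dd string, or (None, None)."""
    if not value:
        return None, None
    try:
        parsed = date.fromisoformat(value)
    except ValueError:
        return None, None  # value is not a valid ISO date
    return parsed.year, parsed.month
```

For a certification record, `date_parts(record["date_from"])` would supply date_from_year and date_from_month, and the same call on date_to would supply the expiration fields.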
Courses
member_courses
Attended courses
Array of objects
organizer
Cleaned
Course organizer
String
title
Cleaned
Course title
String
order_in_profile
Raw
Record order in the section
Integer
Cleaning actions
organizer
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
title
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
Awards
member_awards
Held awards
Array of objects
title
Cleaned
Award
String
issuer
Cleaned
Award issuer
String
description
Cleaned
Award description
String
date
Cleaned
Issue date
String
order_in_profile
Raw
Section record order
Integer
date_year
Cleaned
Issue year
Integer
date_month
Cleaned
Issue month
Integer
date_day
Cleaned
Issue day
Integer
Cleaning actions
title
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None;
Values are capitalized.
issuer
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None.
date
Value is converted to the yyyy-mm-dd format.
date_year
Year value from date is converted to an integer.
date_month
Month value from date is converted to an integer.
Activity
member_activity
Interaction with posts on professional network
Array of objects
activity_url
Raw
Post URL
String
title
Cleaned
Post title
String
action
Cleaned
Interaction type
String
order_in_profile
Raw
Section record order
Integer
Cleaning actions
title
Values ["None"; "Unknown"; "NaN"; "nan"; "na"; "null"; "Null"; "NULL"; "-"; "--"] are replaced with value None;
Text styling tags are removed;
Multiple spaces are replaced with single ones.
Organizations
member_organizations
Memberships in organizations
Array of structs
organization
Organization title
String
position
Position in the organization
String
description
Description of the activity/experience in the organization
String
date_from
Membership start date
String
date_from_year
Membership start year
Integer
date_from_month
Membership start month
Integer
date_to
Membership end date
String
date_to_year
Membership end year
Integer
date_to_month
Membership end month
Integer
order_in_profile
The exact position of the organization in the profile
Integer
Patents
member_patents
Authored patents
Array of structs
title
Patent title
String
status
Patent status
String
inventors
Inventors of the patent
Array of structs
full_name
Full name of the inventor
String
profile_url
Profile URL
String
order_in_profile
Order in profile
Integer
date
Patent filing date
String
date_year
Filing year
Integer
date_month
Filing month
Integer
date_day
Filing day
Integer
patent_url
Patent URL
String
description
Patent description
String
patent_or_application_number
Patent or application number
String
order_in_profile
The exact position of the patent in the profile
Integer
Publications
member_publications
Authored publications
Array of structs
title
Publication title
String
publisher
Publisher name
String
date
Publication release date
String
date_year
Release year
Integer
date_month
Release month
Integer
date_day
Release day
Integer
description
Publication description
String
authors
Authors of the publication
Array of structs
full_name
Full name of the author
String
profile_url
Profile URL
String
order_in_profile
Order in the profile
Integer
publication_url
Publication website URL
String
order_in_profile
The exact position of the publication in the profile
Integer