FAQ
Refresh rate | Available formats | Delivery frequency |
---|---|---|
Ongoing* | JSONL, CSV, and Parquet | Daily, monthly, and quarterly |
📌 * Employee profiles are scraped continuously from a queue that prioritizes profiles with recent changes. Some profiles are no longer accessible, so a complete data refresh is not feasible.
We send the professional network data using the following methods:
Method | Description |
---|---|
Links | We provide a link and login credentials so you can retrieve the data |
Amazon S3 | Provide your storage credentials, and we will send the data to you |
Google Cloud | Provide your storage credentials, and we will send the data to you |
Microsoft Azure | Provide your storage credentials, and we will send the data to you |
We deliver data in location-based datasets: Global (all countries), English-speaking countries, Europe, and the United States. However, you can always submit a custom request.
The following example illustrates downloading a JSON dataset using a download link and credentials provided by us.
- Download the gzipped JSON file using the provided link and credentials by clicking on the file you want to download.
- Unzip the file by clicking on it.
- A JSON file will appear at the unzip location. Each file will have up to 10,000 employee profile records.
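If you prefer to read the file programmatically rather than unzipping it by hand, the steps above can be sketched in Python. This is a minimal sketch, not a reference implementation: the function name `iter_profiles` is hypothetical, and it assumes the file holds either a single JSON array or one JSON object per line (JSON Lines). Check the layout of your own delivery before relying on it.

```python
import gzip
import json
from typing import Iterator


def iter_profiles(path: str) -> Iterator[dict]:
    """Yield profile records from a gzipped JSON or JSON Lines file."""
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        # Look at the first character to guess the layout, then rewind.
        first = fh.read(1)
        fh.seek(0)
        if first == "[":
            # Assumption: the whole file is one JSON array of records.
            yield from json.load(fh)
        else:
            # Assumption: one JSON object per line (JSON Lines).
            for line in fh:
                line = line.strip()
                if line:
                    yield json.loads(line)
```

Because each file holds up to 10,000 records, iterating with `for profile in iter_profiles(path)` keeps memory use modest in the JSON Lines case, where records are parsed one line at a time.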
The following example illustrates downloading a CSV dataset using a download link and credentials provided by us.
- Click on the link and download the csv.gz file.
- Unzip the file by clicking on it.
Each gzipped CSV file contains one table with a specific data collection (e.g., a skills table) from employee profile records.
The gzipped file might contain several files, but they all belong to the same table (e.g., skills).
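When one table arrives as several files, it can be convenient to read them as a single stream of rows. The following Python sketch assumes the parts sit in one folder as separate `.csv.gz` files and that every part starts with the same header row. Both are assumptions, and the function name `iter_table_rows` is hypothetical.

```python
import csv
import gzip
from pathlib import Path
from typing import Iterator


def iter_table_rows(folder: str, pattern: str = "*.csv.gz") -> Iterator[dict]:
    """Yield rows from every gzipped CSV part file of one table."""
    for part in sorted(Path(folder).glob(pattern)):
        # newline="" lets the csv module handle quoted line breaks itself.
        with gzip.open(part, "rt", encoding="utf-8", newline="") as fh:
            # Assumption: every part file starts with the same header row.
            yield from csv.DictReader(fh)
```

Each yielded row is a dictionary keyed by the column names in the header, so the parts of a skills table, for example, can be processed as if they were one file.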
We can only offer general guidance, since the best approach depends on your tech stack and preferences.
Ingesting a large dataset such as Professional Network: Employees can be managed efficiently with a combination of tools and technologies designed for big data workloads.
Tool category | Tool examples |
---|---|
Database systems | |
Data processing frameworks | Apache Spark, Apache Hadoop |
Data ingestion tools | Apache NiFi, Google BigQuery |
Data ETL (Extract, Transform, Load) tools | |
Data transformation | |
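Whichever tools you choose, a common pattern is to load records into a staging table in fixed-size batches. The sketch below illustrates that pattern with Python's built-in `sqlite3` module. SQLite is used here only as a self-contained stand-in, not as a recommendation for data at this scale, and the table name `profiles_raw` and the function names are hypothetical.

```python
import json
import sqlite3
from itertools import islice
from typing import Iterable, Iterator


def batched(records: Iterable, size: int) -> Iterator[list]:
    """Split an iterable into lists of at most `size` items."""
    it = iter(records)
    while batch := list(islice(it, size)):
        yield batch


def load_records(conn: sqlite3.Connection, records: Iterable[dict],
                 batch_size: int = 1000) -> int:
    """Insert records into a staging table in batches; return the row count."""
    # Hypothetical staging table: each record is stored as raw JSON text.
    conn.execute("CREATE TABLE IF NOT EXISTS profiles_raw (payload TEXT NOT NULL)")
    total = 0
    for batch in batched(records, batch_size):
        rows = [(json.dumps(record),) for record in batch]
        with conn:  # commits the batch, or rolls it back on error
            conn.executemany("INSERT INTO profiles_raw (payload) VALUES (?)", rows)
        total += len(rows)
    return total
```

Committing per batch bounds memory use and means a failure partway through loses at most one batch. The same idea carries over to the larger systems in the table above, though their bulk-loading interfaces differ.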