May 2025
Clean Employee data and Clean Employee API
📦 New features
New root-level field added:
public_profile_id (String): Publicly provided employee URN added to each employee record.
New collections introduced:
patents (Array of Structs)
publications (Array of Structs)
organizations (Array of Structs)
🔧 Improvements
Experience data logic updates:
Hidden experiences that no longer aggregate are now correctly marked with deleted=1.
Logic corrected to update date_to fields when newer experience records appear.
Removed inferred company HQ locations from experience records where the original location was empty, addressing mismatches.
Data cleaning enhancements:
HTML tags have been stripped from description fields across multiple entities for cleaner presentation and parsing:
experience.description
education.description
summary
awards.description
patents.description
publications.description
projects.description
organizations
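For records exported before this change, a comparable cleanup can be approximated on the consumer side. The sketch below is not the provider's cleaning logic; it is a minimal illustration built on Python's standard `html.parser`. The helper name `strip_html` and the sample string are hypothetical.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def strip_html(value):
    """Return `value` without HTML tags and with whitespace runs collapsed.

    Hypothetical helper: an approximation, not the provider's implementation.
    """
    parser = _TextExtractor()
    parser.feed(value)
    parser.close()  # flush any text still buffered by the parser
    return " ".join(parser.text().split())


# Made-up sample description, for illustration only.
cleaned = strip_html("<p>Led a <b>team</b> of 5 &amp; shipped</p>")
```

Adjacent block elements are joined without a separator here (for example, `<p>A</p><p>B</p>` becomes `AB`), so a production cleanup would likely want to insert whitespace at block boundaries.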
Multi-source Company data and Multi-source Company API
📦 New feature: fields added
employees_count_inferred (Integer): Estimated employee count based on inferred data.
employees_count_inferred_by_month (Array of Structs): Historical inferred employee counts over a rolling three-year window.
employees_count_inferred_by_month.employees_count_inferred (Integer): Estimated employee count based on inferred data.
date (String): Date identifier.
🔧 Improvements
Key executive data quality:
Removed ~3.3M low-quality or stale profiles from the following collections:
key_executives
key_executive_arrivals
key_executive_departures
Rolling window standardization:
Implemented a consistent three-year rolling window for all *_by_month breakdowns:
active_job_postings_count_by_month
employees_count_by_month
employees_count_by_country_by_month
employees_count_breakdown_by_department_by_month
employees_count_breakdown_by_region_by_month
employees_count_breakdown_by_seniority_by_month
professional_network_followers_count_by_month
product_reviews_score_by_month
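To make the window concrete: a three-year rolling window means each *_by_month array spans at most the 36 most recent months. The sketch below illustrates that idea on the consumer side only. The `YYYY-MM` date format, the inclusive 36-month boundary, the helper names, the `count` key, and the sample entries are all assumptions for illustration; they are not taken from the data dictionary.

```python
from datetime import date

WINDOW_MONTHS = 36  # three years; assumed to include the reference month itself


def month_index(year, month):
    """Map a calendar month to a single increasing integer."""
    return year * 12 + (month - 1)


def within_window(entry_date, reference, window_months=WINDOW_MONTHS):
    """True if `entry_date` (assumed 'YYYY-MM' or 'YYYY-MM-DD') is in the window."""
    year, month = (int(part) for part in entry_date.split("-")[:2])
    age = month_index(reference.year, reference.month) - month_index(year, month)
    return 0 <= age < window_months


def apply_rolling_window(entries, reference):
    """Keep only the entries whose month falls inside the rolling window."""
    return [entry for entry in entries if within_window(entry["date"], reference)]


# Made-up entries; `count` is a hypothetical key, not a documented field.
sample = [
    {"date": "2022-05", "count": 100},
    {"date": "2022-06", "count": 104},
    {"date": "2025-05", "count": 180},
]
kept = apply_rolling_window(sample, date(2025, 5, 1))
```

With May 2025 as the reference month, the entry for 2022-05 is 36 months old and falls outside the window, while 2022-06 and 2025-05 are kept.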
Elasticsearch schema updates (no impact on data dictionary):
Changed data types from Nested to Flattened for improved indexing and performance:
base_salary
total_salary
🐞 Bug fixes
Deduplication:
funding_rounds: Fixed duplicate entries in arrays.
Field normalization:
Resolved issues with lowercased values in several breakdown fields, improving mapping accuracy:
employees_count_breakdown_by_department
employees_count_breakdown_by_department_by_month
employees_count_breakdown_by_seniority
employees_count_breakdown_by_seniority_by_month
employees_count_breakdown_by_region
employees_count_breakdown_by_region_by_month
employees_count_by_country
employees_count_by_country_by_month
Search results
New feature: new API query parameter
items_per_page={int}: Allows clients to specify the number of results returned per page in search endpoints (maximum remains 1,000). Available starting May 21, 2025.
curl -X 'POST' \
  'https://api.coresignal.com/cdapi/v2/employee_base/search/es_dsl?items_per_page=10' \
  -H 'accept: application/json' \
  -H 'apikey: {API Key}' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": {
    "bool": {
      "should": [
        {
          "query_string": {
            "query": "John Doe",
            "default_field": "full_name",
            "default_operator": "and"
          }
        }
      ]
    }
  }
}'
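The same request can also be assembled programmatically. The following is a minimal sketch using only Python's standard library; it builds the request object without sending it. The endpoint, headers, and query body mirror the curl example, while the helper name `build_search_request`, the placeholder key, and the lower bound of 1 in the range check are assumptions.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

SEARCH_URL = "https://api.coresignal.com/cdapi/v2/employee_base/search/es_dsl"
MAX_ITEMS_PER_PAGE = 1000  # maximum stated in this changelog


def build_search_request(api_key, query, items_per_page):
    """Build (but do not send) a search request with an explicit page size.

    Hypothetical helper; the lower bound of 1 is an assumption.
    """
    if not 1 <= items_per_page <= MAX_ITEMS_PER_PAGE:
        raise ValueError(f"items_per_page must be between 1 and {MAX_ITEMS_PER_PAGE}")
    url = f"{SEARCH_URL}?{urlencode({'items_per_page': items_per_page})}"
    return Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={
            "accept": "application/json",
            "apikey": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same query body as the curl example.
query = {
    "query": {
        "bool": {
            "should": [
                {
                    "query_string": {
                        "query": "John Doe",
                        "default_field": "full_name",
                        "default_operator": "and",
                    }
                }
            ]
        }
    }
}

# "YOUR_API_KEY" is a placeholder, not a real credential.
request = build_search_request("YOUR_API_KEY", query, items_per_page=10)
```

Passing `request` to `urllib.request.urlopen` would perform the call; that step is left out so the sketch runs without network access or a valid key.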