May 2025

Clean Employee data and Clean Employee API

📦 New features

New root-level field added:

public_profile_id (String): Publicly provided employee URN added to each employee record.

New collections introduced:

patents (Array of Structs)
publications (Array of Structs)
organizations (Array of Structs)

🔧 Improvements

Experience data logic updates:

Hidden experiences that no longer aggregate are now correctly marked with deleted=1.
Logic corrected to update date_to fields when newer experience records appear.
Removed inferred company HQ locations from experience records where original location was empty, addressing mismatches.

Data cleaning enhancements:

HTML tags have been stripped from description fields across multiple entities for cleaner presentation and parsing:

experience.description
education.description
summary
awards.description
patents.description
publications.description
projects.description
organizations
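For consumers who still hold older snapshots containing markup, a comparable client-side cleanup might look like the Python sketch below. It is not the cleaning logic used for the dataset; the helper name `strip_html` and the sample string are made up for illustration.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only text nodes and drops every tag."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into plain characters.
        super().__init__(convert_charrefs=True)
        self._chunks = []

    def handle_data(self, data):
        self._chunks.append(data)

    def text(self):
        # Join text nodes with a space, then collapse repeated whitespace.
        # Trade-off: a tag placed inside a word would introduce a space there.
        return " ".join(" ".join(self._chunks).split())


def strip_html(value):
    """Hypothetical helper: return `value` with HTML tags removed."""
    if not value:
        return value
    parser = _TextExtractor()
    parser.feed(value)
    parser.close()
    return parser.text()


# Invented sample description, not taken from the dataset.
sample = "<p>Led a team of <b>5</b> engineers.</p><br>Shipped &amp; maintained APIs."
print(strip_html(sample))
```

Records delivered after this release should not need such a step for the fields listed above.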

Multi-source Company data and Multi-source Company API

📦 New feature: fields added

employees_count_inferred (Integer): Estimated employee count based on inferred data.
employees_count_inferred_by_month (Array of structs): Historical inferred employee counts over a rolling three-year window.
employees_count_inferred_by_month.employees_count_inferred (Integer): Estimated employee count for the given month, based on inferred data.
employees_count_inferred_by_month.date (String): Date identifier for the monthly entry.
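To show how these fields might be read together, here is a small Python sketch. The record fragment is invented, and the `YYYY-MM` date format is an assumption, since the note only describes the date as a string identifier; `latest_inferred_count` is a hypothetical helper.

```python
# Illustrative record fragment; the numbers and the date format are invented.
company = {
    "employees_count_inferred": 1280,
    "employees_count_inferred_by_month": [
        {"employees_count_inferred": 1190, "date": "2025-03"},
        {"employees_count_inferred": 1240, "date": "2025-04"},
        {"employees_count_inferred": 1280, "date": "2025-05"},
    ],
}


def latest_inferred_count(record):
    """Return (date, count) for the most recent monthly entry, or None."""
    history = record.get("employees_count_inferred_by_month") or []
    if not history:
        return None
    # Assumes ISO-style date strings, which sort chronologically as plain text.
    newest = max(history, key=lambda entry: entry["date"])
    return newest["date"], newest["employees_count_inferred"]


print(latest_inferred_count(company))
```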

🔧 Improvements

Key executive data quality:

Removed ~3.3M low-quality or stale profiles from the following collections:

key_executives
key_executive_arrivals
key_executive_departures

Rolling window standardization:

Implemented a consistent three-year rolling window for all *_by_month breakdowns:

active_job_postings_count_by_month
employees_count_by_month
employees_count_by_country_by_month
employees_count_breakdown_by_department_by_month
employees_count_breakdown_by_region_by_month
employees_count_breakdown_by_seniority_by_month
professional_network_followers_count_by_month
product_reviews_score_by_month
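The idea of a three-year rolling window can be sketched in Python as a filter over monthly entries. This is only an illustration: the `date` key, the `YYYY-MM` format, and the exact boundary (36 months counting the reference month) are assumptions, and the helper names are hypothetical.

```python
def month_index(date_str):
    """Convert 'YYYY-MM' (or 'YYYY-MM-DD') into a running month number."""
    year, month = int(date_str[0:4]), int(date_str[5:7])
    return year * 12 + (month - 1)


def within_rolling_window(entries, reference, months=36):
    """Keep entries dated within `months` months up to and including `reference`."""
    ref = month_index(reference)
    return [
        entry
        for entry in entries
        if 0 <= ref - month_index(entry["date"]) < months
    ]


# Invented monthly entries around the window boundaries.
history = [
    {"date": "2022-05", "employees_count": 90},   # 36 months back: outside
    {"date": "2022-06", "employees_count": 95},   # 35 months back: inside
    {"date": "2025-05", "employees_count": 140},  # reference month: inside
]
print(within_rolling_window(history, reference="2025-05"))
```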

Elasticsearch schema updates (no impact on data dictionary):

Changed data types from Nested to Flattened for improved indexing and performance:

base_salary
total_salary

🐞 Bug fixes

Deduplication:

funding_rounds: Fixed duplicate entries in arrays.

Field normalization:

Resolved issues with lowercased values in several breakdown fields, improving mapping accuracy:

employees_count_breakdown_by_department
employees_count_breakdown_by_department_by_month
employees_count_breakdown_by_seniority
employees_count_breakdown_by_seniority_by_month
employees_count_breakdown_by_region
employees_count_breakdown_by_region_by_month
employees_count_by_country
employees_count_by_country_by_month

Search results

New feature: new API query parameter

items_per_page={int}: Lets clients specify the number of results returned per page in search endpoints (the maximum remains 1,000). Available starting May 21, 2025.

Sample

curl -X 'POST' \
'https://api.coresignal.com/cdapi/v2/employee_base/search/es_dsl?items_per_page=10' \
  -H 'accept: application/json' \
  -H 'apikey: {API Key}' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
        "bool": {
            "should": [
                {
                    "query_string": {
                        "query": "John Doe",
                        "default_field": "full_name",
                        "default_operator": "and"
                    }
                }
            ]
        }
    }
}'
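The same request could be prepared from Python roughly as follows. This is a sketch using only the standard library: `build_search_request` is a hypothetical helper, the lower bound of 1 is an assumption (the note documents only the 1,000 maximum), and the API key is a placeholder.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.coresignal.com/cdapi/v2/employee_base/search/es_dsl"
MAX_ITEMS_PER_PAGE = 1000  # maximum stated in the release note


def build_search_request(query, items_per_page, api_key):
    """Hypothetical helper: prepare (but do not send) the POST request."""
    # The lower bound of 1 is an assumption; only the upper bound is documented.
    if not 1 <= items_per_page <= MAX_ITEMS_PER_PAGE:
        raise ValueError("items_per_page must be between 1 and 1000")
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"items_per_page": items_per_page})
    return urllib.request.Request(
        url,
        data=json.dumps(query).encode("utf-8"),
        headers={
            "accept": "application/json",
            "apikey": api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same Elasticsearch DSL body as the curl sample.
query = {
    "query": {
        "bool": {
            "should": [
                {
                    "query_string": {
                        "query": "John Doe",
                        "default_field": "full_name",
                        "default_operator": "and",
                    }
                }
            ]
        }
    }
}

request = build_search_request(query, items_per_page=10, api_key="{API Key}")
print(request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` with a real key would send it; the sketch stops short of that so it can run without network access.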
