# May 2025

## Clean Employee data and Clean Employee API

### 📦 New features

#### **New root-level field added**:

* `public_profile_id` (String): Publicly provided employee URN added to each employee record.

#### **New collections introduced**:

* `patents` (Array of Structs)
* `publications` (Array of Structs)
* `organizations` (Array of Structs)

### 🔧 Improvements

#### **Experience data logic updates**:

* Hidden experiences that no longer aggregate are now correctly marked with `deleted=1`.
* Logic corrected to update `date_to` fields when newer experience records appear.
* Removed inferred company HQ locations from experience records where original location was empty, addressing mismatches.
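
As a rough illustration of how a consumer might act on the `deleted` flag, the sketch below filters flagged experiences out of an employee record. The record layout (an `experience` list of dicts carrying `deleted` and `date_to`) and the sample values are assumptions made for this example, not the documented schema.

```python
def active_experiences(record):
    """Return experience entries that are not flagged as deleted."""
    experiences = record.get("experience") or []
    # Treat a missing flag as "not deleted"; int() lets 1 and "1" both match.
    return [e for e in experiences if int(e.get("deleted") or 0) != 1]


# Hypothetical record; all values are invented for the example.
sample = {
    "experience": [
        {"title": "Engineer", "deleted": 0, "date_to": None},
        {"title": "Intern", "deleted": 1, "date_to": "2019-08"},
    ]
}

print([e["title"] for e in active_experiences(sample)])
```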

#### **Data cleaning enhancements**:

HTML tags have been stripped from description fields across multiple entities for cleaner presentation and parsing:

* `experience.description`
* `education.description`
* `summary`
* `awards.description`
* `patents.description`
* `publications.description`
* `projects.description`
* `organizations`

## Multi-source Company data and Multi-source Company API

### 📦 New feature: **fields added**

* `employees_count_inferred` (Integer): Estimated employee count based on inferred data.
* `employees_count_inferred_by_month` (Array of structs): Historical inferred employee counts over a rolling three-year window.
* `employees_count_inferred_by_month.employees_count_inferred` (Integer): Estimated employee count for the given month, based on inferred data.
* `employees_count_inferred_by_month.date` (String): Date identifier of the monthly entry.
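
The sketch below shows one way a client might read the most recent entry from `employees_count_inferred_by_month`. It assumes that `date` sits inside each monthly struct and that its string values sort chronologically (for example an ISO-style `YYYY-MM`); the date format is not specified here, and the sample values are invented.

```python
def latest_inferred_count(company):
    """Return (date, count) for the most recent monthly entry, or None if absent."""
    months = company.get("employees_count_inferred_by_month") or []
    if not months:
        return None
    # Assumption: `date` strings sort chronologically (e.g. ISO-style "YYYY-MM").
    latest = max(months, key=lambda m: m["date"])
    return latest["date"], latest["employees_count_inferred"]


# Hypothetical company record; values are invented for the example.
sample = {
    "employees_count_inferred": 130,
    "employees_count_inferred_by_month": [
        {"date": "2025-03", "employees_count_inferred": 120},
        {"date": "2025-05", "employees_count_inferred": 130},
        {"date": "2025-04", "employees_count_inferred": 125},
    ],
}

print(latest_inferred_count(sample))
```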

### 🔧 Improvements

#### **Key executive data quality**:

Removed approximately 3.3M low-quality or stale profiles from the following collections:

* `key_executives`
* `key_executive_arrivals`
* `key_executive_departures`

#### **Rolling window standardization**:

Implemented a consistent three-year rolling window for all `*_by_month` breakdowns:

* `active_job_postings_count_by_month`
* `employees_count_by_month`
* `employees_count_by_country_by_month`
* `employees_count_breakdown_by_department_by_month`
* `employees_count_breakdown_by_region_by_month`
* `employees_count_breakdown_by_seniority_by_month`
* `professional_network_followers_count_by_month`
* `product_reviews_score_by_month`

#### **Elasticsearch schema updates** *(no impact on data dictionary)*:

Changed data types from **Nested** to **Flattened** for improved indexing and performance:

* `base_salary`
* `total_salary`

### 🐞 Bug fixes

#### **Deduplication**:

`funding_rounds`: Fixed duplicate entries in arrays.

#### **Field normalization**:

Resolved issues with lowercased values in several breakdown fields, improving mapping accuracy:

* `employees_count_breakdown_by_department`
* `employees_count_breakdown_by_department_by_month`
* `employees_count_breakdown_by_seniority`
* `employees_count_breakdown_by_seniority_by_month`
* `employees_count_breakdown_by_region`
* `employees_count_breakdown_by_region_by_month`
* `employees_count_by_country`
* `employees_count_by_country_by_month`

## Search results

### New feature: **new API query parameter**

`items_per_page={int}`: Lets clients set the number of results returned per page by search endpoints (the maximum remains 1,000). Available from May 21, 2025.

{% code title="Sample" %}

```bash
curl -X 'POST' \
  'https://api.coresignal.com/cdapi/v2/employee_base/search/es_dsl?items_per_page=10' \
  -H 'accept: application/json' \
  -H 'apikey: {API Key}' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
        "bool": {
            "should": [
                {
                    "query_string": {
                        "query": "John Doe",
                        "default_field": "full_name",
                        "default_operator": "and"
                    }
                }
            ]
        }
    }
}'
```

{% endcode %}
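
For clients that build requests in code, the same call might look roughly like the Python sketch below. It only constructs the request and does not send it. The helper name `build_search_request` and the lower bound of 1 are assumptions for illustration; the endpoint, headers, and the 1,000 maximum are taken from the sample and note above.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://api.coresignal.com/cdapi/v2/employee_base/search/es_dsl"
MAX_ITEMS_PER_PAGE = 1000  # maximum stated in this release note


def build_search_request(api_key, query, items_per_page=10):
    """Build (but do not send) a POST search request with items_per_page set."""
    # Lower bound of 1 is an assumption; only the 1,000 maximum is documented here.
    if not 1 <= items_per_page <= MAX_ITEMS_PER_PAGE:
        raise ValueError("items_per_page must be between 1 and 1000")
    url = SEARCH_URL + "?" + urllib.parse.urlencode({"items_per_page": items_per_page})
    headers = {
        "accept": "application/json",
        "apikey": api_key,
        "Content-Type": "application/json",
    }
    body = json.dumps(query).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


query = {
    "query": {
        "bool": {
            "should": [
                {
                    "query_string": {
                        "query": "John Doe",
                        "default_field": "full_name",
                        "default_operator": "and",
                    }
                }
            ]
        }
    }
}

request = build_search_request("{API Key}", query, items_per_page=25)
print(request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; validating the page size locally is a design choice that avoids a round trip for values above the documented maximum.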

