Dictionary: Clean Company Data

Request access to our full documentation

This is a simplified version of our documentation. Request access to the full documentation if you want to:

- access data samples
- learn more about the cleaning and enrichment actions
- explore the complete list of data sources we offer

Clean Company Data provides high-quality, structured business data ready for immediate use. Our data is meticulously cleaned and enriched, allowing organizations to streamline their workflows and confidently make data-driven decisions. By leveraging Clean Company Data, businesses can reduce engineering overhead, gain access to additional insights, and work with optimized data formats for improved efficiency.

With multiple retrieval options, including flat file downloads in JSONL, Parquet, and CSV formats as well as API access, our solution adapts to your needs, ensuring seamless integration into your existing data infrastructure.

Clean Company Data is derived from our Base Company Data.

The data points are separated into collections to visualize the data better. The data provided in the samples is strictly intended for illustrative purposes, allowing you to better understand its appearance and format.

Metadata

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| company last updated | Cleaned | Record update date | String (date) |
| company created at | Cleaned | Record creation date | String (date) |
| professional network source id | Raw | Record identification key assigned by professional network | String |

Data sample: Meta data

    "company created at": "2023-12-06",
    "company last updated": "2024-12-06",
    "professional network source id": "60191",

Cleaning actions

| Data point | Cleaning action |
| --- | --- |
| company last updated | Value is converted to the YYYY-MM-DD format |
| company created at | Value is converted to the YYYY-MM-DD format |

Identifiers

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| company id | Raw | Company ID in our database | Number (integer) |
| company hash | Raw | Company profile URL processed by the MD5 algorithm | String |
| company canonical shorthand name hash | Raw | Canonical shorthand name processed by the MD5 algorithm | String |
| company name | Cleaned | Company name | String |
| company logo | Cleaned | Base64-encoded JPEG image of the company's logo | String |
| company ticker | Cleaned | Company's stock ticker | String |
| company exchange | Cleaned | Company's stock exchange | String |

Data sample: Clean Company Data

    "company id": 7811468,
    "company hash": "8ef8d364df382df483f47fe3e56dc4cd",
    "company canonical shorthand name hash": "8631ca96b6f656040bf3326deeb38df6",
    "company name": "Example Company",
    "company logo": "/9j/4aaqskzjrgabaqaaaqabaad/2wbd... [Base64 string truncated]",
    "company ticker": "EXMP",
    "company exchange": "NYSE",

Cleaning and enriching actions

| Data point | Cleaning/enriching action |
| --- | --- |
| company name | Values ["none"; "unknown"; "nan"; "na"; "null"; " "] are replaced with value None |
| company logo | Image is resized to 50x50 px |
| company ticker | Values ["none"; "unknown"; "nan"; "na"; "null"; " "] are replaced with value None |

Firmographics

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| company industry | Cleaned | Company's industry | String |
| company type | Cleaned | Company type | String |
| company founded | Cleaned | Company's founding year | String |
| company size range | Cleaned | Company size range | String |
| company size employees count | Enriched | The number of employees working in the company | Number (integer) |
| company followers | Cleaned | The number of company followers | Number (integer) |
| company description | Cleaned | Company description | String |
| company specialities | Raw | Company specialties | String |
| metadata title | Enriched | Company title parsed from additional sources | String |
| metadata description | Enriched | Company description parsed from additional sources | String |
| company enriched summary | Enriched | LLM-enriched company summary | String |
| company enriched category | Enriched | Company category assigned with LLM | String |
| company enriched keywords | Enriched | LLM-enriched company keywords | Array of strings |
| company enriched b2b | Enriched | Marks if the company offers B2B products/services; enriched with the help of LLM. 1 = B2B company, 0 = not a B2B company | Boolean |

Data sample: Clean Company Data

    "company type": "Partnership",
    "company founded": "2010",
    "company followers": 0,
    "company size range": "1-10 employees",
    "company size employees count": 2,
    "company industry": "Advertising Services",
    "company description": "We help SMEs grow their businesses through effective online marketing strategies.",
    "company specialities": "email marketing, web sites, search engine optimisation, inbound marketing, social media marketing",
    "company enriched summary": "Company1 is a premier web design and digital marketing agency based in London, UK. Specializing in custom, responsive websites, they provide professional design services, training, easy content management, and ongoing support.",
    "company enriched keywords": [ "website design", "digital marketing", "professional", "custom responsive websites", "training" ],
    "company enriched b2b": 1.0,
    "company enriched category": "web design",
    "metadata title": "marketing, london,cost effective web design",
    "metadata description": null

Cleaning and enriching actions

| Data point | Cleaning/enriching action |
| --- | --- |
| company industry | Values ["none"; "unknown"; "nan"; "na"; "null"; " "] are replaced with value None |
| company type | Values ["none"; "unknown"; "nan"; "na"; "null"; " "] are replaced with value None |
| company founded | Values ["none"; "unknown"; "nan"; "na"; "null"; " "] are replaced with value None; values are replaced with None if the year is not between 500 and the current year |
| company followers | Values ["none"; "unknown"; "nan"; "na"; "null"; " "] are replaced with value 0; every value is converted to an integer |
| company size range | Some inconsistencies between overlapping values are fixed: "1 employee" is replaced with "Myself Only"; "2-10 employees" with "1-10 employees"; "501-1,000 employees" with "501-1000 employees"; "1,001-5,000 employees" with "1001-5000 employees" |
| company size employees count | We calculate the number of employees working in the company using if company size employees count does not provide enough information |
| company description | Values ["none"; "unknown"; "nan"; "na"; "null"; " "] are replaced with value None; value is replaced with None if the description is shorter than 3 characters; text styling tags are removed; multiple spaces are replaced with single ones |

Product and services overview

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| pricing available | Enriched | Marks if the company service pricing is available online | Boolean |
| free trial available | Enriched | Marks if the company offers a free trial of their services | Boolean |
| demo available | Enriched | Marks if the company offers a demo | Boolean |
| is downloadable | Enriched | Marks if the company offers a downloadable file/service | Boolean |
| mobile apps exist | Enriched | Marks if the company has mobile apps for their service | Boolean |
| online reviews exist | Enriched | Marks if the company has any online reviews | Boolean |
| api docs exist | Enriched | Marks if the company has API docs published | Boolean |

Data sample: Clean Company Data

    "pricing available": true,
    "free trial available": false,
    "demo available": false,
    "is downloadable": false,
    "mobile apps exist": false,
    "online reviews exist": false,
    "api docs exist": false,

Enriching actions

| Data point | Enriching action |
| --- | --- |
| pricing available, free trial available, demo available, is downloadable, mobile apps exist, online reviews exist, api docs exist | Information taken from the official company website |

Contact information

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| company phone numbers | Enriched | Publicly available company phone number | Array of strings |
| company emails | Enriched | Publicly available company email address | Array of strings |

Data sample: Contact information

    "company phone numbers": [ "0000 000 000" ],
    "company emails": [ "info@company123.com" ],

Enriching actions

| Data point | Enriching action |
| --- | --- |
| company phone numbers, company emails | Information taken from the official company website |

Social media and websites

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| company websites main original | Raw | Company website | String |
| company websites main | Enriched | Cleaned and resolved website URL | String |
| company websites facebook | Enriched | Facebook profile URL | String |
| company websites twitter | Enriched | Twitter profile URL | String |
| company websites professional network | Raw | Company professional network profile URL | String |
| company websites professional network canonical | Raw | Canonical professional network profile URL | String |
| company social discord urls | Enriched | Discord channel URL | Array of strings |
| company social facebook urls | Enriched | Facebook profile URL | Array of strings |
| company social instagram urls | Enriched | Instagram profile URL | Array of strings |
| company social professional network urls | Enriched | Company professional network profile URL | Array of strings |
| company social pinterest urls | Enriched | Pinterest profile URL | Array of strings |
| company social tiktok urls | Enriched | TikTok profile URL | Array of strings |
| company social twitter urls | Enriched | Twitter profile URL | Array of strings |
| company social x urls | Enriched | X profile URL | Array of strings |
| company social youtube urls | Enriched | YouTube channel/profile URL | Array of strings |
| company social github urls | Enriched | GitHub page/profile URL | Array of strings |
| company social reddit urls | Enriched | Reddit profile URL | Array of strings |

Data sample: Clean Company Data

    "company websites main original": "http://www.example-company.com",
    "company websites main": "https://example-company.com",
    "company websites facebook": "https://www.facebook.com/example-company",
    "company websites twitter": "https://www.twitter.com/example-company",
    "company websites professional network": "https://www.professional-network.com/company/example-company-international-limited",
    "company websites professional network canonical": "https://www.professional-network.com/company/example-company-international-limited",

Data sample: Company social links

    "company social discord urls": [ "https://discord.gg/example-company" ],
    "company social facebook urls": [ "https://www.facebook.com/example-company" ],
    "company social instagram urls": [ "https://www.instagram.com/example-company" ],
    "company social professional network urls": [ "https://www.professional-network.com/company/example-company" ],
    "company social pinterest urls": [ "https://www.pinterest.com/example-company" ],
    "company social tiktok urls": [ "https://www.tiktok.com/@example-company" ],
    "company social twitter urls": [ "https://twitter.com/example-company" ],
    "company social x urls": [ "https://www.example-company-x.com" ],
    "company social youtube urls": [ "https://www.youtube.com/c/example-company" ],
    "company social github urls": [ "https://github.com/example-company" ],
    "company social reddit urls": [ "https://www.reddit.com/user/example-company" ]

Cleaning and enriching actions

| Data point | Cleaning/enriching action |
| --- | --- |
| company websites main | Every company websites main original URL is resolved; each URL we collect is parsed, its parameters are removed, and the result is added to the company websites main column; expired domains are removed; additional enrichment actions are completed |
| company websites twitter | If <domain> (from company websites main) == twitter, we move the URL value to company websites twitter |
| company websites facebook | If <domain> (from company websites main) == facebook, we move the URL value to company websites facebook |
| company websites linkedin | If <domain> (from company websites main) == linkedin, we move the URL value to company websites linkedin |
| company social discord urls, company social facebook urls, company social instagram urls, company social linkedin urls, company social pinterest urls, company social tiktok urls, company social twitter urls, company social x urls, company social youtube urls, company social github urls, company social reddit urls | URLs taken from the official company website |

Location

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| company location hq country | Cleaned | Headquarters country | String |
| company location hq raw address | Cleaned | Detailed company location | String |
| company location hq regions | Enriched | Geographical region(s) the company is associated with, based on the company location hq country value | String |
| company locations full | Raw | Full company location information | Array of objects |
| location address | Raw | Company HQ location | String |
| is primary | Raw | Marks if the listed location is the primary one | Boolean |

Data sample: Locations

    "company location hq raw address": "Los Angeles, CA, United States",
    "company location hq country": "United States",
    "company location hq regions": "[Northern America, Northern America, AMER]",

Data sample: Full location

    "company locations full": [
      {
        "location address": "416 Bedonwell Road Abbeywood; London, SE2 0SE, GB",
        "is primary": false
      }
    ]

Cleaning actions

| Data point | Cleaning action |
| --- | --- |
| location hq country | Values ["none"; "unknown"; "nan"; "na"; "null"; " "] are replaced with value None |
| location hq raw address | Values ["none"; "unknown"; "nan"; "na"; "null"; " "] are replaced with value None; special trailing characters are trimmed; the company location hq country value is added to the end of the string (separated by a comma) |

Funding information

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| company funding rounds | | Funding round details | Array of objects |
| last round investors count | Cleaned | The number of investors that participated in the last funding round | Number (integer) |
| total rounds count | Cleaned | Total number of funding rounds | Number (integer) |
| last round type | Cleaned | Last funding round type | String |
| last round date | Cleaned | Last funding round date | String |
| last round money raised | Cleaned | Total funds raised | Number (integer) |
| financial website url | Raw | Financial website URL of the last funding round | String |

Data sample: Clean Company Data

    "company funding rounds": [
      {
        "last round investors count": 5,
        "total rounds count": 3,
        "last round type": "Series A",
        "last round date": "2020-11-10",
        "last round money raised": 15600000,
        "financial website url": "https://www.financial-website.com/funding-round/example-company-series-a-f1687fe3"
      }
    ]

Technologies

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| company technologies | Enriched | Technologies used by the company | Array of structs |
| technology | Enriched | Technology name | String |
| first verified at | Enriched | Date this technology was first assigned to the company. Date format: YYYY-MM-DD | String (date) |
| last verified at | Enriched | Date this technology was last assigned to the company. Date format: YYYY-MM-DD | String (date) |

Data sample: Technologies

    "company technologies": [
      {
        "technology": "React",
        "first verified at": "2022-03-15",
        "last verified at": "2025-02-15"
      }
    ]

Enriching actions

| Data point | Enriching action |
| --- | --- |
| company technologies | Enriched by our ML model from multiple sources |

Supporting fields

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| expired domain | Enriched | Indicates that the company websites main original URL redirects to a domain dealer | Boolean |
| unique subdomain | Enriched | Indicates that only the record company owns the subdomain | Boolean |
| unique domain | Enriched | Indicates that only this company has the right to have this unique domain, e.g., company websites main: https://ibm.com | Boolean |
| unique website | Enriched | Indicates that only this company has a unique website but not necessarily a unique domain, e.g., company websites main: https://ibm.com/generation | Boolean |

Data sample: Clean Company Data

    "expired domain": 0,
    "unique domain": 1,
    "unique subdomain": 1,
    "unique website": 0,

Company updates

| Data point | Processing | Description | Data type |
| --- | --- | --- | --- |
| company updates | | Company posts and related details | Array of objects |
| urn | Raw | String-based identifier | String |
| followers | Raw | Number of followers | String |
| date | Raw | Post publish date (e.g., 1 month ago) | String |
| description | Raw | Published text. Note: may contain control characters | String |
| reactions count | Raw | Number of reactions on the post | Integer |
| comments count | Raw | Number of comments on the post | Integer |
| reshared post author | Raw | Reshared post author | String |
| reshared post author url | Raw | Author's profile URL | String |
| reshared post author headline | Raw | Author's headline | String |
| reshared post description | Raw | Reshared post text | String |
| reshared post followers | Raw | The number of followers of the reshared post author | Integer |
| reshared post date | Raw | Date the reshared post was published (e.g., 1 month ago) | String |

Data sample: Company updates table

    "company updates collection": [
      {
        "urn": "urn:pn:activity:6991335602751201281",
        "followers": 1371,
        "date": "1mo",
        "description": "Example description",
        "reactions count": 22,
        "comments count": 2,
        "reshared post author": "John Doe",
        "reshared post author url": "https://www.professional-network.com/in/john-doe",
        "reshared post author headline": "Co-founder at Example Company, TEDx & keynote speaker",
        "reshared post description": "Example description",
        "reshared post followers": 45,
        "reshared post date": "1mo"
      }
    ]
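
To make the cleaning actions described in this dictionary more concrete, here is a minimal Python sketch of a few of them: placeholder replacement, the founding-year range check, follower conversion, size-range normalization, and description cleanup. It illustrates the rules as stated and is not the actual pipeline. The function names are hypothetical, and several details are assumptions of this sketch: case-insensitive placeholder matching, treating non-numeric years as None, the HTML-like form of the "text styling tags", the capitalization of "Myself Only", and the order in which the description rules are applied.

```python
import re
from datetime import date
from typing import Optional

# Placeholder strings from the cleaning-action tables. Matching is done on the
# stripped, lowercased value, which is an assumption of this sketch.
PLACEHOLDERS = {"none", "unknown", "nan", "na", "null", ""}

# Size-range fixes quoted in the Firmographics cleaning actions
# (keys lowercased; the casing of "Myself Only" is assumed).
SIZE_RANGE_FIXES = {
    "1 employee": "Myself Only",
    "2-10 employees": "1-10 employees",
    "501-1,000 employees": "501-1000 employees",
    "1,001-5,000 employees": "1001-5000 employees",
}


def clean_text(value) -> Optional[str]:
    """Return None for placeholder values, otherwise the stripped string."""
    if value is None:
        return None
    text = str(value).strip()
    return None if text.lower() in PLACEHOLDERS else text


def clean_founded(value, today: Optional[date] = None) -> Optional[str]:
    """Keep the year (as a string) only if it lies between 500 and the current year."""
    text = clean_text(value)
    if text is None or not text.isdigit():  # non-numeric years -> None (assumption)
        return None
    current_year = (today or date.today()).year
    return text if 500 <= int(text) <= current_year else None


def clean_followers(value) -> int:
    """Placeholders become 0; everything else is converted to an integer."""
    text = clean_text(value)
    if text is None:
        return 0
    try:
        return int(float(text))
    except (ValueError, OverflowError):
        return 0  # unparseable input -> 0 (assumption)


def clean_size_range(value) -> Optional[str]:
    """Map the inconsistent size-range labels to their normalized form."""
    if value is None:
        return None
    text = str(value).strip()
    return SIZE_RANGE_FIXES.get(text.lower(), text)


def clean_description(value) -> Optional[str]:
    """Drop placeholders, remove styling tags, collapse spaces, reject very short text."""
    text = clean_text(value)
    if text is None:
        return None
    text = re.sub(r"<[^>]+>", "", text)         # styling tags, assumed HTML-like
    text = re.sub(r" {2,}", " ", text).strip()  # multiple spaces -> single space
    return text if len(text) >= 3 else None
```

For example, `clean_size_range("2-10 employees")` returns `"1-10 employees"`, and `clean_followers("nan")` returns `0`. The same helpers can be applied per field when comparing your own raw records against the cleaned values delivered in the dataset.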