CrunchBase is committed to providing the startup community with an open, up-to-date, and accurate database of companies, venture fundings, and entrepreneurs. In the spirit of this commitment, we’ve decided to publish the raw data behind articles like NYC Angels Grab Market Share and Mining the Series A Crunch. Of course, the data has always been available through the CrunchBase API, but not everyone is up for REST APIs. So today we’re going to try something different: an Excel spreadsheet that contains a significant portion of our dataset.
By releasing CrunchBase data in a simple, easy-to-consume format, we hope that the community will produce and share valuable insights on technology startups. In fact, if you share those insights with us, we’ll do our best to publish them for everyone to see. Please let us know what you discover.
Is the CrunchBase dataset 100% accurate? No. Does the dataset have gaps? Yes. Does the dataset contain some duplicate records? Yes. That said, we believe the best way to improve CrunchBase is through the community.
When we published Mining the Series A Crunch, not only did thousands of people download the dataset, but many of you immediately updated inaccurate profiles. Remember, CrunchBase is a database that anyone can edit. By releasing more data and building partnerships, some of which you will hear about this week at Disrupt NY 2013, we believe we can create an amazing and valuable resource for the technology community.
Here’s a quick FAQ. If you have questions, drop us a note at firstname.lastname@example.org. More importantly, if you find errors, please update the appropriate CrunchBase profiles. Together we can make this dataset awesome!
What are we publishing today? You can download an Excel spreadsheet that contains the following data from CrunchBase:
- Companies headquartered in the US that have reported raising money
- Investors (individual and institutional) who invested in the aforementioned companies
- Funding rounds in the aforementioned companies
Will this data be updated on a regular basis? Maybe. Before committing to anything, we want to hear your feedback. Take a look at the dataset, and if you have ideas on how to make it better, how often to publish it, or any other thoughts, write to us at email@example.com.
Does this data match what’s on CrunchBase? Mostly… The data was pulled from CrunchBase on April 28, 2013 and contains a subset of the database as described above. Remember that this dataset is only as good as CrunchBase.
Is there a data dictionary? No. We think it’s pretty self-explanatory. Note that we tried to tie each company to a region (like NYC, SF Bay Area, etc.), and you will find columns aptly named “region”. This mapping may not be entirely accurate given the large number of misspelled city names in our system, which is something we need to fix.
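If you want to double-check the region column yourself, here is a minimal sketch of one way to map a misspelled city name to a region using fuzzy string matching from Python's standard library. It is not how CrunchBase builds the column. The `CITY_TO_REGION` table, the `guess_region` helper, and the 0.8 similarity cutoff are all assumptions made for illustration; you would replace the table with your own list of cities.

```python
import difflib

# Hypothetical lookup table (an assumption for illustration, not CrunchBase's
# actual mapping). Keys are lower-cased canonical city names.
CITY_TO_REGION = {
    "san francisco": "SF Bay Area",
    "palo alto": "SF Bay Area",
    "new york": "NYC",
    "brooklyn": "NYC",
}


def guess_region(city, cutoff=0.8):
    """Return the region for a possibly misspelled city name, or None."""
    key = city.strip().lower()
    # An exact match needs no fuzzy lookup.
    if key in CITY_TO_REGION:
        return CITY_TO_REGION[key]
    # Otherwise take the single closest canonical name above the cutoff.
    matches = difflib.get_close_matches(
        key, list(CITY_TO_REGION), n=1, cutoff=cutoff
    )
    return CITY_TO_REGION[matches[0]] if matches else None


if __name__ == "__main__":
    for raw in ["San Fransisco", "  New Yrok ", "Austin"]:
        print(repr(raw), "->", guess_region(raw))
```

A stricter cutoff gives fewer wrong matches but leaves more cities unmapped, so it is worth spot-checking the results by hand before relying on them.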
What can I do with the data? Amazing things! Check out NYC Angels Grab Market Share and then open the spreadsheet to see how we got those graphs.
Where can I download the data? Just click here…