Open Data Sources Database
Database: Data Sources
Know another great open data source? Please add it so that others can benefit from it! You can add to the database here.
Filter by category:
Name | Link | Category |
---|---|---|
U.S. Department of Agriculture's PLANTS Database | http://www.plants.usda.gov/dl_all.html | Agriculture |
U.S. Department of Agriculture's Nutrient Database | https://www.ars.usda.gov/northeast-area/beltsville-md/beltsville-human-nutrition-research-center/nutrient-data-laboratory/docs/sr28-download-files/ | Agriculture |
1000 Genomes | http://www.1000genomes.org/data | Biology |
American Gut (Microbiome Project) | https://github.com/biocore/American-Gut | Biology |
Broad Bioimage Benchmark Collection (BBBC) | https://www.broadinstitute.org/bbbc | Biology |
Broad Cancer Cell Line Encyclopedia (CCLE) | http://www.broadinstitute.org/ccle/home | Biology |
Cell Image Library | http://www.cellimagelibrary.org/ | Biology |
Complete Genomics Public Data | http://www.completegenomics.com/public-data/69-genomes/ | Biology |
EBI ArrayExpress | http://www.ebi.ac.uk/arrayexpress/ | Biology |
EBI Protein Data Bank in Europe | http://www.ebi.ac.uk/pdbe/emdb/index.html/ | Biology |
Electron Microscopy Pilot Image Archive (EMPIAR) | http://www.ebi.ac.uk/pdbe/emdb/empiar/ | Biology |
ENCODE project | https://www.encodeproject.org/ | Biology |
Ensembl Genomes | http://ensemblgenomes.org/info/genomes | Biology |
Gene Expression Omnibus (GEO) | http://www.ncbi.nlm.nih.gov/geo/ | Biology |
Gene Ontology (GO) | http://geneontology.org/page/download-annotations | Biology |
Global Biotic Interactions (GloBI) | https://github.com/jhpoelen/eol-globi-data/wiki#accessing-species-interaction-data | Biology |
Harvard Medical School (HMS) LINCS Project | http://lincs.hms.harvard.edu/ | Biology |
Human Genome Diversity Project | http://www.hagsc.org/hgdp/files.html | Biology |
Human Microbiome Project (HMP) | http://www.hmpdacc.org/reference_genomes/reference_genomes.php | Biology |
ICOS PSP Benchmark | http://ico2s.org/datasets/psp_benchmark.html | Biology |
International HapMap Project | http://hapmap.ncbi.nlm.nih.gov/downloads/index.html.en | Biology |
Journal of Cell Biology DataViewer | http://jcb-dataviewer.rupress.org/ | Biology |
MIT Cancer Genomics Data | http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi | Biology |
NCBI Proteins | http://www.ncbi.nlm.nih.gov/guide/proteins/#databases | Biology |
NCBI Taxonomy | http://www.ncbi.nlm.nih.gov/taxonomy | Biology |
NCI Genomic Data Commons | https://gdc-portal.nci.nih.gov/ | Biology |
NIH Microarray data or FTP (see FTP link on RAW) | http://bit.do/VVW6 | Biology |
OpenSNP genotypes data | https://opensnp.org/ | Biology |
Pathguide - Protein-Protein Interactions Catalog | http://www.pathguide.org/ | Biology |
Protein Data Bank | http://www.rcsb.org/ | Biology |
Psychiatric Genomics Consortium | https://www.med.unc.edu/pgc/downloads | Biology |
PubChem Project | https://pubchem.ncbi.nlm.nih.gov/ | Biology |
PubGene (now Coremine Medical) | http://www.pubgene.org/ | Biology |
Sanger Catalogue of Somatic Mutations in Cancer (COSMIC) | http://cancer.sanger.ac.uk/cosmic | Biology |
Sanger Genomics of Drug Sensitivity in Cancer Project (GDSC) | http://www.cancerrxgene.org/ | Biology |
Sequence Read Archive (SRA) | http://www.ncbi.nlm.nih.gov/Traces/sra/ | Biology |
Stanford Microarray Data | http://smd.stanford.edu/ | Biology |
Stowers Institute Original Data Repository | http://www.stowers.org/research/publications/odr | Biology |
Systems Science of Biological Dynamics (SSBD) Database | http://ssbd.qbic.riken.jp/ | Biology |
The Cancer Genome Atlas (TCGA), available via Broad GDAC | https://gdac.broadinstitute.org/ | Biology |
The Catalogue of Life | http://www.catalogueoflife.org/content/annual-checklist-archive | Biology |
The Personal Genome Project or PGP | http://www.personalgenomes.org/ | Biology |
UCSC Public Data | http://hgdownload.soe.ucsc.edu/downloads.html | Biology |
UniGene | http://www.ncbi.nlm.nih.gov/unigene | Biology |
Universal Protein Resource (UniProt) | http://www.uniprot.org/downloads | Biology |
Actuaries Climate Index | http://actuariesclimateindex.org/data/ | Climate/Weather |
Australian Weather | http://www.bom.gov.au/climate/dwo/ | Climate/Weather |
Aviation Weather Center - Consistent, timely and accurate weather information for the world airspace system | https://aviationweather.gov/adds/dataserver | Climate/Weather |
Brazilian Weather - Historical data (In Portuguese) | http://sinda.crn2.inpe.br/PCD/SITE/novo/site/ | Climate/Weather |
Canadian Meteorological Centre | http://weather.gc.ca/grib/index_e.html | Climate/Weather |
Climate Data from UEA (updated monthly) | https://crudata.uea.ac.uk/cru/data/temperature/#datter and ftp://ftp.cmdl.noaa.gov/ | Climate/Weather |
European Climate Assessment & Dataset | http://eca.knmi.nl/ | Climate/Weather |
Global Climate Data Since 1929 | http://en.tutiempo.net/climate | Climate/Weather |
NASA Global Imagery Browse Services | https://wiki.earthdata.nasa.gov/display/GIBS | Climate/Weather |
NOAA Bering Sea Climate | http://www.beringclimate.noaa.gov/ | Climate/Weather |
NOAA Climate Datasets | http://www.ncdc.noaa.gov/data-access/quick-links | Climate/Weather |
NOAA Realtime Weather Models | http://www.ncdc.noaa.gov/data-access/model-data/model-datasets/numerical-weather-prediction | Climate/Weather |
NOAA SURFRAD Meteorology and Radiation Datasets | https://www.esrl.noaa.gov/gmd/grad/stardata.html | Climate/Weather |
The World Bank Open Data Resources for Climate Change | http://data.worldbank.org/developers/climate-data-api | Climate/Weather |
UEA Climatic Research Unit | http://www.cru.uea.ac.uk/data | Climate/Weather |
WorldClim - Global Climate Data | http://www.worldclim.org/ | Climate/Weather |
WU Historical Weather Worldwide | https://www.wunderground.com/history/index.html | Climate/Weather |
AMiner Citation Network Dataset | http://aminer.org/citation | Complex Networks |
CrossRef DOI URLs | https://archive.org/details/doi-urls | Complex Networks |
DBLP Citation dataset | https://kdl.cs.umass.edu/display/public/DBLP | Complex Networks |
DIMACS Road Networks Collection | http://www.dis.uniroma1.it/challenge9/download.shtml | Complex Networks |
NBER Patent Citations | http://nber.org/patents/ | Complex Networks |
Network Repository with Interactive Exploratory Analysis Tools | http://networkrepository.com/ | Complex Networks |
NIST complex networks data collection | http://math.nist.gov/~RPozo/complex_datasets.html | Complex Networks |
Protein-protein interaction network | http://vlado.fmf.uni-lj.si/pub/networks/data/bio/Yeast/Yeast.htm | Complex Networks |
PyPI and Maven Dependency Network | https://ogirardot.wordpress.com/2013/01/31/sharing-pypimaven-dependency-data/ | Complex Networks |
Scopus Citation Database | https://www.elsevier.com/solutions/scopus | Complex Networks |
Small Network Data | http://www-personal.umich.edu/~mejn/netdata/ | Complex Networks |
Stanford GraphBase (Steven Skiena) | http://www3.cs.stonybrook.edu/~algorith/implement/graphbase/implement.shtml | Complex Networks |
Stanford Large Network Dataset Collection | http://snap.stanford.edu/data/ | Complex Networks |
Stanford Longitudinal Network Data Sources | http://stanford.edu/group/sonia/dataSources/index.html | Complex Networks |
The Koblenz Network Collection | http://konect.uni-koblenz.de/ | Complex Networks |
The Laboratory for Web Algorithmics (UNIMI) | http://law.di.unimi.it/datasets.php | Complex Networks |
The Nexus Network Repository | http://nexus.igraph.org/ | Complex Networks |
UCI Network Data Repository | https://networkdata.ics.uci.edu/resources.php | Complex Networks |
UFL sparse matrix collection | http://www.cise.ufl.edu/research/sparse/matrices/ | Complex Networks |
WSU Graph Database | http://www.eecs.wsu.edu/mgd/gdb.html | Complex Networks |
3.5B Web Pages from CommonCrawl 2012 | http://www.bigdatanews.com/profiles/blogs/big-data-set-3-5-billion-web-pages-made-available-for-all-of-us | Computer Networks |
53.5B Web clicks of 100K users in Indiana Univ. | http://cnets.indiana.edu/groups/nan/webtraffic/click-dataset/ | Computer Networks |
CAIDA Internet Datasets | http://www.caida.org/data/overview/ | Computer Networks |
ClueWeb09 - 1B web pages | http://lemurproject.org/clueweb09/ | Computer Networks |
ClueWeb12 - 733M web pages | http://lemurproject.org/clueweb12/ | Computer Networks |
CommonCrawl Web Data over 7 years | http://commoncrawl.org/the-data/get-started/ | Computer Networks |
CRAWDAD Wireless datasets from Dartmouth Univ. | https://crawdad.cs.dartmouth.edu/ | Computer Networks |
Criteo click-through data | http://labs.criteo.com/2015/03/criteo-releases-its-new-dataset/ | Computer Networks |
OONI: Open Observatory of Network Interference - Internet censorship data | https://ooni.torproject.org/data/ | Computer Networks |
Open Mobile Data by MobiPerf | https://console.developers.google.com/storage/openmobiledata_public/ | Computer Networks |
Rapid7 Sonar Internet Scans | https://sonar.labs.rapid7.com/ | Computer Networks |
UCSD Network Telescope, IPv4 /8 net | http://www.caida.org/projects/network_telescope/ | Computer Networks |
Bruteforce Database | https://github.com/duyetdev/bruteforce-database | Data Challenges |
Challenges in Machine Learning | http://www.chalearn.org/ | Data Challenges |
CrowdANALYTIX dataX | http://data.crowdanalytix.com/ | Data Challenges |
D4D Challenge of Orange | http://www.d4d.orange.com/en/home | Data Challenges |
DrivenData Competitions for Social Good | http://www.drivendata.org/ | Data Challenges |
ICWSM Data Challenge (since 2009) | http://icwsm.cs.umbc.edu/ | Data Challenges |
Kaggle Competition Data | https://www.kaggle.com/ | Data Challenges |
KDD Cup by Tencent 2012 | http://www.kddcup2012.org/ | Data Challenges |
Localytics Data Visualization Challenge | https://github.com/localytics/data-viz-challenge | Data Challenges |
Netflix Prize | http://netflixprize.com/leaderboard.html | Data Challenges |
Space Apps Challenge | https://2015.spaceappschallenge.org/ | Data Challenges |
Telecom Italia Big Data Challenge | https://dandelion.eu/datamine/open-big-data/ | Data Challenges |
TravisTorrent Dataset - MSR'2017 Mining Challenge | https://travistorrent.testroots.org/ | Data Challenges |
Yelp Dataset Challenge | http://www.yelp.com/dataset_challenge | Data Challenges |
AQUASTAT - Global water resources and uses | http://www.fao.org/nr/water/aquastat/data/query/index.html?lang=en | Earth Science |
BODC - marine data of ~22K vars | https://www.bodc.ac.uk/data/ | Earth Science |
Earth Models | http://www.earthmodels.org/ | Earth Science |
EOSDIS - NASA's earth observing system data | http://sedac.ciesin.columbia.edu/data/sets/browse | Earth Science |
Integrated Marine Observing System (IMOS) - roughly 30TB of ocean measurements or on S3 | https://imos.aodn.org.au/ | Earth Science |
Marinexplore - Open Oceanographic Data | http://marinexplore.org/ | Earth Science |
Smithsonian Institution Global Volcano and Eruption Database | http://volcano.si.edu/ | Earth Science |
USGS Earthquake Archives | http://earthquake.usgs.gov/earthquakes/search/ | Earth Science |
American Economic Association (AEA) | https://www.aeaweb.org/resources/data | Economics |
EconData from UMD | http://inforumweb.umd.edu/econdata/econdata.html | Economics |
Economic Freedom of the World Data | http://www.freetheworld.com/datasets_efw.html | Economics |
Historical MacroEconomic Statistics | http://www.historicalstatistics.org/ | Economics |
International Economics Database and various data tools | http://widukind.cepremap.org/ and https://github.com/Widukind | Economics |
International Trade Statistics | http://www.econostatistics.co.za/ | Economics |
Internet Product Code Database | http://www.upcdatabase.com/ | Economics |
Joint External Debt Data Hub | http://www.jedh.org/ | Economics |
Jon Haveman International Trade Data Links | | Economics |
OpenCorporates Database of Companies in the World | https://opencorporates.com/ | Economics |
Our World in Data | http://ourworldindata.org/ | Economics |
SciencesPo World Trade Gravity Datasets | http://econ.sciences-po.fr/thierry-mayer/data | Economics |
The Atlas of Economic Complexity | http://atlas.cid.harvard.edu/ | Economics |
The Center for International Data | http://cid.econ.ucdavis.edu/ | Economics |
The Observatory of Economic Complexity | http://atlas.media.mit.edu/en/ | Economics |
UN Commodity Trade Statistics | http://comtrade.un.org/db/ | Economics |
UN Human Development Reports | http://hdr.undp.org/en | Economics |
College Scorecard Data | https://collegescorecard.ed.gov/data/ | Education |
Student Data from Free Code Camp | http://academictorrents.com/details/030b10dad0846b5aecc3905692890fb02404adbf | Education |
AMPds | http://ampds.org/ | Energy |
BLUEd | http://nilm.cmubi.org/ | Energy |
COMBED | http://combed.github.io/ | Energy |
Dataport | https://dataport.pecanstreet.org/ | Energy |
DRED | http://www.st.ewi.tudelft.nl/~akshay/dred/ | Energy |
ECO | http://www.vs.inf.ethz.ch/res/show.html?what=eco-data | Energy |
EIA | http://www.eia.gov/electricity/data/eia923/ | Energy |
HES - Household Electricity Study, UK | http://randd.defra.gov.uk/Default.aspx?Menu=Menu&Module=More&Location=None&ProjectID=17359&FromSearch=Y&Publisher=1&SearchText=EV0702&SortString=ProjectCode&SortOrder=Asc&Paging=10#Description | Energy |
HFED | http://hfed.github.io/ | Energy |
iAWE | http://iawe.github.io/ | Energy |
PLAID - the Plug Load Appliance Identification Dataset | http://plaidplug.com/ | Energy |
REDD | http://redd.csail.mit.edu/ | Energy |
Tracebase | https://www.tracebase.org/ | Energy |
UK-DALE - UK Domestic Appliance-Level Electricity | http://www.doc.ic.ac.uk/~dk3810/data/ | Energy |
WHITED | http://nilmworkshop.org/2016/proceedings/Poster_ID18.pdf | Energy |
CBOE Futures Exchange | http://cfe.cboe.com/Data/ | Finance |
Google Finance | https://www.google.com/finance | Finance |
Google Trends | http://www.google.com/trends?q=google&ctab=0&geo=all&date=all&sort=0 | Finance |
NASDAQ | https://data.nasdaq.com/ | Finance |
NYSE Market Data (see FTP link on RAW) | | Finance |
OANDA | http://www.oanda.com/ | Finance |
OSU Financial data | http://fisher.osu.edu/fin/fdf/osudata.htm | Finance |
Quandl | https://www.quandl.com/ | Finance |
St. Louis Federal Reserve (FRED) | https://research.stlouisfed.org/fred2/ | Finance |
Yahoo Finance | http://finance.yahoo.com/ | Finance |
ArcGIS Open Data portal | http://opendata.arcgis.com/ | GIS |
Cambridge, MA, US, GIS data on GitHub | http://cambridgegis.github.io/gisdata.html | GIS |
Factual Global Location Data | https://www.factual.com/ | GIS |
Geo Spatial Data from ASU | http://geodacenter.asu.edu/datalist/ | GIS |
Geo Wiki Project - Citizen-driven Environmental Monitoring | http://geo-wiki.org/ | GIS |
GeoFabrik - OSM data extracted to a variety of formats and areas | http://download.geofabrik.de/ | GIS |
GeoNames Worldwide | http://www.geonames.org/ | GIS |
Global Administrative Areas Database (GADM) | http://www.gadm.org/ | GIS |
Homeland Infrastructure Foundation-Level Data | https://hifld-dhs-gii.opendata.arcgis.com/ | GIS |
Landsat 8 on AWS | https://aws.amazon.com/public-data-sets/landsat/ | GIS |
List of all countries in all languages | https://github.com/umpirsky/country-list | GIS |
National Weather Service GIS Data Portal | http://www.nws.noaa.gov/gis/ | GIS |
Natural Earth - vectors and rasters of the world | http://www.naturalearthdata.com/ | GIS |
OpenAddresses | http://openaddresses.io/ | GIS |
OpenStreetMap (OSM) | http://wiki.openstreetmap.org/wiki/Downloading_data | GIS |
Pleiades - Gazetteer and graph of ancient places | http://pleiades.stoa.org/ | GIS |
Reverse Geocoder using OSM data & additional high-resolution data files | https://github.com/kno10/reversegeocode | GIS |
TIGER/Line - U.S. boundaries and roads | http://www.census.gov/geo/maps-data/data/tiger-line.html | GIS |
TwoFishes - Foursquare's coarse geocoder | https://github.com/foursquare/twofishes | GIS |
TZ Timezones shapefiles | http://efele.net/maps/tz/world/ | GIS |
UN Environmental Data | http://geodata.grid.unep.ch/ | GIS |
World boundaries from the U.S. Department of State | https://hiu.state.gov/data/data.aspx | GIS |
World countries in multiple formats | https://github.com/mledoze/countries | GIS |
A list of cities and countries contributed by community | https://github.com/caesar0301/awesome-public-datasets/blob/master/Government.rst | Government |
Open Data for Africa | http://opendataforafrica.org/ | Government |
OpenDataSoft's list of 1,600 open data | https://www.opendatasoft.com/a-comprehensive-list-of-all-open-data-portals-around-the-world/ | Government |
EHDP Large Health Data Sets | http://www.ehdp.com/vitalnet/datasets.htm | Healthcare |
Gapminder World demographic databases | http://www.gapminder.org/data/ | Healthcare |
GDC supports several cancer genome programs for CCG, TCGA, TARGET etc. | https://gdc.cancer.gov/ | Healthcare |
PhysioBank Databases - a large and growing archive of physiological data | https://www.physionet.org/physiobank/database/ | Healthcare |
Medicare Coverage Database (MCD), U.S. | https://www.cms.gov/medicare-coverage-database/ | Healthcare |
Medicare Data Engine of medicare.gov Data | https://data.medicare.gov/ | Healthcare |
Medicare Data File | http://go.cms.gov/19xxPN4 | Healthcare |
MeSH, the vocabulary thesaurus used for indexing articles for PubMed | https://www.nlm.nih.gov/mesh/filelist.html | Healthcare |
Number of Ebola Cases and Deaths in Affected Countries (2014) | https://data.hdx.rwlabs.org/dataset/ebola-cases-2014 | Healthcare |
Open-ODS (structure of the UK NHS) | http://www.openods.co.uk/ | Healthcare |
OpenPaymentsData, Healthcare financial relationship data | https://openpaymentsdata.cms.gov/ | Healthcare |
The Cancer Genome Atlas project (TCGA) (refer to GDC and BigQuery table) | https://portal.gdc.cancer.gov/ | Healthcare |
World Health Organization Global Health Observatory | http://www.who.int/gho/en/ | Healthcare |
10k US Adult Faces Database | http://wilmabainbridge.com/facememorability2.html | Image Processing |
2GB of Photos of Cats or Archive version | http://137.189.35.203/WebUI/CatDatabase/catData.html | Image Processing |
Adience Unfiltered faces for gender and age classification | http://www.openu.ac.il/home/hassner/Adience/data.html | Image Processing |
Affective Image Classification | http://www.imageemotion.org/ | Image Processing |
Animals with attributes | http://attributes.kyb.tuebingen.mpg.de/ | Image Processing |
Caltech Pedestrian Detection Benchmark | http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/ | Image Processing |
Chars74K dataset, Character Recognition in Natural Images (both English and Kannada are available) | http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/ | Image Processing |
Face Recognition Benchmark | http://www.face-rec.org/databases/ | Image Processing |
Flickr: 32 Class Brand Logos | http://www.multimedia-computing.de/flickrlogos/ | Image Processing |
GDXray: X-ray images for X-ray testing and Computer Vision | http://dmery.ing.puc.cl/index.php/material/gdxray/ | Image Processing |
ImageNet (in WordNet hierarchy) | http://www.image-net.org/ | Image Processing |
Indoor Scene Recognition | http://web.mit.edu/torralba/www/indoor.html | Image Processing |
International Affective Picture System, UFL | http://csea.phhp.ufl.edu/media/iapsmessage.html | Image Processing |
Massive Visual Memory Stimuli, MIT | http://cvcl.mit.edu/MM/stimuli.html | Image Processing |
MNIST database of handwritten digits, 70,000 examples | http://yann.lecun.com/exdb/mnist/ | Image Processing |
Several Shape-from-Silhouette Datasets | http://kaiwolf.no-ip.org/3d-model-repository.html | Image Processing |
Stanford Dogs Dataset | http://vision.stanford.edu/aditya86/ImageNetDogs/ | Image Processing |
SUN database, MIT | http://groups.csail.mit.edu/vision/SUN/hierarchy.html | Image Processing |
The Action Similarity Labeling (ASLAN) Challenge | http://www.openu.ac.il/home/hassner/data/ASLAN/ASLAN.html | Image Processing |
The Oxford-IIIT Pet Dataset | http://www.robots.ox.ac.uk/~vgg/data/pets/ | Image Processing |
Violent-Flows - Crowd Violence Non-violence Database and benchmark | http://www.openu.ac.il/home/hassner/data/violentflows/ | Image Processing |
Visual genome | http://visualgenome.org/api/v0/api_home.html | Image Processing |
YouTube Faces Database | http://www.cs.tau.ac.il/~wolf/ytfaces/ | Image Processing |
Context-aware data sets from five domains | https://github.com/irecsys/CARSKit/tree/master/context-aware_data_sets | Machine Learning |
Delve Datasets for classification and regression (Univ. of Toronto) | http://www.cs.toronto.edu/~delve/data/datasets.html | Machine Learning |
Discogs Monthly Data | http://data.discogs.com/ | Machine Learning |
eBay Online Auctions (2012) | http://www.modelingonlineauctions.com/datasets | Machine Learning |
IMDb Database | http://www.imdb.com/interfaces | Machine Learning |
Keel Repository for classification, regression and time series | http://sci2s.ugr.es/keel/datasets.php | Machine Learning |
Labeled Faces in the Wild (LFW) | http://vis-www.cs.umass.edu/lfw/ | Machine Learning |
Lending Club Loan Data | https://www.lendingclub.com/info/download-data.action | Machine Learning |
Machine Learning Data Set Repository | http://mldata.org/ | Machine Learning |
Free Music Archive | https://github.com/mdeff/fma | Machine Learning |
Million Song Dataset | http://labrosa.ee.columbia.edu/millionsong/ | Machine Learning |
More Song Datasets | http://labrosa.ee.columbia.edu/millionsong/pages/additional-datasets | Machine Learning |
MovieLens Data Sets | http://grouplens.org/datasets/movielens/ | Machine Learning |
New Yorker caption contest ratings | https://github.com/nextml/caption-contest-data | Machine Learning |
RDataMining - "R and Data Mining" ebook data | http://www.rdatamining.com/data | Machine Learning |
Registered Meteorites on Earth | http://publichealthintelligence.org/content/registered-meteorites-has-impacted-earth-visualized | Machine Learning |
Restaurants Health Score Data in San Francisco | http://missionlocal.org/san-francisco-restaurant-health-inspections/ | Machine Learning |
UCI Machine Learning Repository | http://archive.ics.uci.edu/ml/ | Machine Learning |
Yahoo! Ratings and Classification Data | http://webscope.sandbox.yahoo.com/catalog.php?datatype=r | Machine Learning |
Youtube 8m | https://research.google.com/youtube8m/download.html | Machine Learning |
Canada Science and Technology Museums Corporation's Open Data | http://techno-science.ca/en/data.php | Museums |
Cooper-Hewitt's Collection Database | https://github.com/cooperhewitt/collection | Museums |
Minneapolis Institute of Arts metadata | https://github.com/artsmia/collection | Museums |
Natural History Museum (London) Data Portal | http://data.nhm.ac.uk/ | Museums |
Rijksmuseum Historical Art Collection | https://www.rijksmuseum.nl/en/api | Museums |
Tate Collection metadata | https://github.com/tategallery/collection | Museums |
The Getty vocabularies | http://vocab.getty.edu/ | Museums |
POS/NER/Chunk annotated data | https://github.com/aritter/twitter_nlp/tree/master/data/annotated | Natural Language |
Automatic Keyphrase Extraction | https://github.com/snkim/AutomaticKeyphraseExtraction/ | Natural Language |
Blogger Corpus | http://u.cs.biu.ac.il/~koppel/BlogCorpus.htm | Natural Language |
CLiPS Stylometry Investigation Corpus | http://www.clips.uantwerpen.be/datasets/csi-corpus | Natural Language |
ClueWeb09 FACC | http://lemurproject.org/clueweb09/FACC1/ | Natural Language |
ClueWeb12 FACC | http://lemurproject.org/clueweb12/FACC1/ | Natural Language |
DBpedia - 4.58M things with 583M facts | http://wiki.dbpedia.org/Datasets | Natural Language |
Flickr Personal Taxonomies | http://www.isi.edu/~lerman/downloads/flickr/flickr_taxonomies.html | Natural Language |
Freebase.com of people, places, and things | http://www.freebase.com/ | Natural Language |
Google Books Ngrams (2.2TB) | https://aws.amazon.com/datasets/google-books-ngrams/ | Natural Language |
Google MC-AFP, generated based on the public available Gigaword dataset using Paragraph Vectors | https://github.com/google/mcafp | Natural Language |
Google Web 5gram (1TB, 2006) | https://catalog.ldc.upenn.edu/LDC2006T13 | Natural Language |
Gutenberg eBooks List | http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs | Natural Language |
Hansards text chunks of Canadian Parliament | http://www.isi.edu/natural-language/download/hansard/ | Natural Language |
Machine Comprehension Test (MCTest) of text from Microsoft Research | http://research.microsoft.com/en-us/um/redmond/projects/mctest/index.html | Natural Language |
Machine Translation of European languages | http://statmt.org/wmt11/translation-task.html#download | Natural Language |
Making Sense of Microposts 2013 - Concept Extraction | http://oak.dcs.shef.ac.uk/msm2013/challenge.html | Natural Language |
Making Sense of Microposts 2016 - Named Entity rEcognition and Linking | http://microposts2016.seas.upenn.edu/challenge.html | Natural Language |
Microsoft MAchine Reading COmprehension Dataset (or MS MARCO) | http://www.msmarco.org/dataset.aspx | Natural Language |
Multi-Domain Sentiment Dataset (version 2.0) | http://www.cs.jhu.edu/~mdredze/datasets/sentiment/ | Natural Language |
Open Multilingual Wordnet | http://compling.hss.ntu.edu.sg/omw/ | Natural Language |
Personae Corpus | http://www.clips.uantwerpen.be/datasets/personae-corpus | Natural Language |
SaudiNewsNet Collection of Saudi Newspaper Articles (Arabic, 30K articles) | https://github.com/ParallelMazen/SaudiNewsNet | Natural Language |
SMS Spam Collection in English | http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/ | Natural Language |
Universal Dependencies | http://universaldependencies.org/ | Natural Language |
USENET postings corpus of 2005~2011 | http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html | Natural Language |
Webhose - News/Blogs in multiple languages | https://webhose.io/datasets | Natural Language |
Wikidata - Wikipedia databases | https://www.wikidata.org/wiki/Wikidata:Database_download | Natural Language |
Wikipedia Links data - 40 Million Entities in Context | https://code.google.com/p/wiki-links/downloads/list | Natural Language |
WordNet databases and tools | http://wordnet.princeton.edu/wordnet/download/ | Natural Language |
Brain Catalogue | http://braincatalogue.org/ | Neuroscience |
Brainomics | http://brainomics.cea.fr/localizer | Neuroscience |
CodeNeuro Datasets | http://datasets.codeneuro.org/ | Neuroscience |
Collaborative Research in Computational Neuroscience (CRCNS) | http://crcns.org/data-sets | Neuroscience |
FCP-INDI | http://fcon_1000.projects.nitrc.org/index.html | Neuroscience |
Human Connectome Project | http://www.humanconnectome.org/data/ | Neuroscience |
NDAR | https://ndar.nih.gov/ | Neuroscience |
NeuroData | http://neurodata.io/ | Neuroscience |
Neuroelectro | http://neuroelectro.org/ | Neuroscience |
NIMH Data Archive | http://data-archive.nimh.nih.gov/ | Neuroscience |
OASIS | http://www.oasis-brains.org/ | Neuroscience |
OpenfMRI | https://openfmri.org/ | Neuroscience |
Study Forrest | http://studyforrest.org/ | Neuroscience |
CERN Open Data Portal | http://opendata.cern.ch/ | Physics |
Crystallography Open Database | http://www.crystallography.net/ | Physics |
NASA Exoplanet Archive | http://exoplanetarchive.ipac.caltech.edu/ | Physics |
NSSDC (NASA) data of 550 spacecraft | http://nssdc.gsfc.nasa.gov/nssdc/obtaining_data.html | Physics |
Sloan Digital Sky Survey (SDSS) - Mapping the Universe | http://www.sdss.org/ | Physics |
OSU Cognitive Modeling Repository Datasets | http://www.cmr.osu.edu/browse/datasets | Psychology/Cognition |
Amazon | http://aws.amazon.com/datasets/ | Public Domains |
Archive-it from Internet Archive | https://www.archive-it.org/explore?show=Collections | Public Domains |
Archive.org Datasets | https://archive.org/details/datasets | Public Domains |
CMU JASA data archive | http://lib.stat.cmu.edu/jasadata/ | Public Domains |
CMU StatLab collections | http://lib.stat.cmu.edu/datasets/ | Public Domains |
Data.World | https://data.world/ | Public Domains |
Data360 | http://www.data360.org/index.aspx | Public Domains |
Google Public Data | http://www.google.com/publicdata/directory | Public Domains |
Infochimps | http://www.infochimps.com/ | Public Domains |
KDNuggets Data Collections | http://www.kdnuggets.com/datasets/index.html | Public Domains |
Microsoft Azure Data Market Free DataSets | http://datamarket.azure.com/browse/data?price=free | Public Domains |
Microsoft Data Science for Research | http://aka.ms/Data-Science | Public Domains |
Numbrary | http://numbrary.com/ | Public Domains |
Open Library Data Dumps | https://openlibrary.org/developers/dumps | Public Domains |
Reddit Datasets | https://www.reddit.com/r/datasets | Public Domains |
RevolutionAnalytics Collection | http://packages.revolutionanalytics.com/datasets/ | Public Domains |
Sample R data sets | http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html | Public Domains |
Stats4Stem R data sets | http://www.stats4stem.org/data-sets.html | Public Domains |
StatSci.org | http://www.statsci.org/datasets.html | Public Domains |
The Washington Post List | http://www.washingtonpost.com/wp-srv/metro/data/datapost.html | Public Domains |
UCLA SOCR data collection | http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data | Public Domains |
UFO Reports | http://www.nuforc.org/webreports.html | Public Domains |
Wikileaks 911 pager intercepts | https://911.wikileaks.org/files/index.html | Public Domains |
Yahoo Webscope | http://webscope.sandbox.yahoo.com/catalog.php | Public Domains |
Academic Torrents of data sharing from UMB | http://academictorrents.com/ | Search Engines |
Datahub.io | https://datahub.io/dataset | Search Engines |
DataMarket (Qlik) | https://datamarket.com/data/list/?q=all | Search Engines |
Harvard Dataverse Network of scientific data | https://dataverse.harvard.edu/ | Search Engines |
ICPSR (UMICH) | http://www.icpsr.umich.edu/icpsrweb/ICPSR/index.jsp | Search Engines |
Institute of Education Sciences | http://eric.ed.gov/ | Search Engines |
National Technical Reports Library | http://www.ntis.gov/products/ntrl/ | Search Engines |
Open Data Certificates (beta) | https://certificates.theodi.org/en/datasets | Search Engines |
OpenDataNetwork - A search engine of all Socrata powered data portals | http://www.opendatanetwork.com/ | Search Engines |
Statista.com - Statistics and Studies | http://www.statista.com/ | Search Engines |
Zenodo - An open dependable home for the long-tail of science | https://zenodo.org/collection/datasets | Search Engines |
72 hours #gamergate Twitter Scrape | http://waxy.org/random/misc/gamergate_tweets.csv | Social Networks |
Ancestry.com Forum Dataset over 10 years | http://www.cs.cmu.edu/~jelsas/data/ancestry.com/ | Social Networks |
Cheng-Caverlee-Lee September 2009 - January 2010 Twitter Scrape | https://archive.org/details/twitter_cikm_2010 | Social Networks |
CMU Enron Email of 150 users | http://www.cs.cmu.edu/~enron/ | Social Networks |
EDRM Enron EMail of 151 users, hosted on S3 | https://aws.amazon.com/datasets/enron-email-data/ | Social Networks |
Facebook Data Scrape (2005) | https://archive.org/details/oxford-2005-facebook-matrix | Social Networks |
Facebook Social Networks from LAW (since 2007) | http://law.di.unimi.it/datasets.php | Social Networks |
Foursquare from UMN/Sarwat (2013) | https://archive.org/details/201309_foursquare_dataset_umn | Social Networks |
GitHub Collaboration Archive | https://www.githubarchive.org/ | Social Networks |
Google Scholar citation relations | http://www3.cs.stonybrook.edu/~leman/data/gscholar.db | Social Networks |
High-Resolution Contact Networks from Wearable Sensors | http://www.sociopatterns.org/datasets/ | Social Networks |
Indie Map: social graph and crawl of top IndieWeb sites | http://www.indiemap.org/ | Social Networks |
Mobile Social Networks from UMASS | https://kdl.cs.umass.edu/display/public/Mobile+Social+Networks | Social Networks |
Network Twitter Data | http://snap.stanford.edu/data/higgs-twitter.html | Social Networks |
Reddit Comments | https://www.reddit.com/r/datasets/comments/3bxlg7/i_have_every_publicly_available_reddit_comment/ | Social Networks |
Skytrax' Air Travel Reviews Dataset | https://github.com/quankiquanki/skytrax-reviews-dataset | Social Networks |
Social Twitter Data | http://snap.stanford.edu/data/egonets-Twitter.html | Social Networks |
SourceForge.net Research Data | http://www3.nd.edu/~oss/Data/data.html | Social Networks |
Twitter Data for Online Reputation Management | http://nlp.uned.es/replab2013/ | Social Networks |
Twitter Data for Sentiment Analysis | http://help.sentiment140.com/for-students/ | Social Networks |
Twitter Graph of entire Twitter site | http://an.kaist.ac.kr/traces/WWW2010.html | Social Networks |
Twitter Scrape Calufa May 2011 | http://archive.org/details/2011-05-calufa-twitter-sql | Social Networks |
UNIMI/LAW Social Network Datasets | http://law.di.unimi.it/datasets.php | Social Networks |
Yahoo! Graph and Social Data | http://webscope.sandbox.yahoo.com/catalog.php?datatype=g | Social Networks |
YouTube Video Social Graph in 2007, 2008 | http://netsg.cs.sfu.ca/youtubedata/ | Social Networks |
ACLED (Armed Conflict Location & Event Data Project) | http://www.acleddata.com/ | Social Sciences |
Canadian Legal Information Institute | https://www.canlii.org/en/index.php | Social Sciences |
Center for Systemic Peace Datasets - Conflict Trends, Polities, State Fragility, etc | http://www.systemicpeace.org/ | Social Sciences |
Correlates of War Project | http://www.correlatesofwar.org/ | Social Sciences |
Cryptome Conspiracy Theory Items | http://cryptome.org/ | Social Sciences |
Datacards | http://datacards.org/ | Social Sciences |
European Social Survey | http://www.europeansocialsurvey.org/data/ | Social Sciences |
FBI Hate Crime 2013 - aggregated data | https://github.com/emorisse/FBI-Hate-Crime-Statistics/tree/master/2013 | Social Sciences |
Fragile States Index | http://fsi.fundforpeace.org/data | Social Sciences |
GDELT Global Events Database | http://gdeltproject.org/data.html | Social Sciences |
General Social Survey (GSS) since 1972 | http://gss.norc.org/ | Social Sciences |
German Social Survey | http://www.gesis.org/en/home/ | Social Sciences |
Global Religious Futures Project | http://www.globalreligiousfutures.org/ | Social Sciences |
Humanitarian Data Exchange | https://data.hdx.rwlabs.org/ | Social Sciences |
INFORM Index for Risk Management | http://www.inform-index.org/Results/Global | Social Sciences |
Institute for Demographic Studies | http://www.ined.fr/en/ | Social Sciences |
International Networks Archive | http://www.princeton.edu/~ina/ | Social Sciences |
International Social Survey Program ISSP | http://www.issp.org/ | Social Sciences |
International Studies Compendium Project | http://www.isacompendium.com/public/ | Social Sciences |
James McGuire Cross National Data | http://jmcguire.faculty.wesleyan.edu/welcome/cross-national-data/ | Social Sciences |
MacroData Guide by Norsk samfunnsvitenskapelig datatjeneste | http://nsd.uib.no/ | Social Sciences |
Minnesota Population Center | https://www.ipums.org/ | Social Sciences |
MIT Reality Mining Dataset | http://realitycommons.media.mit.edu/realitymining.html | Social Sciences |
Notre Dame Global Adaptation Index (ND-GAIN) | http://index.gain.org/about/download | Social Sciences |
Open Crime and Policing Data in England, Wales and Northern Ireland | https://data.police.uk/data/ | Social Sciences |
Paul Hensel General International Data Page | http://www.paulhensel.org/dataintl.html | Social Sciences |
PewResearch Internet Survey Project | http://www.pewinternet.org/datasets/pages/2/ | Social Sciences |
PewResearch Society Data Collection | http://www.pewresearch.org/data/download-datasets/ | Social Sciences |
Political Polarity Data | http://www3.cs.stonybrook.edu/~leman/data/14-icwsm-political-polarity-data.zip | Social Sciences |
StackExchange Data Explorer | http://data.stackexchange.com/help | Social Sciences |
Terrorism Research and Analysis Consortium | http://www.trackingterrorism.org/ | Social Sciences |
Texas Inmates Executed Since 1984 | http://www.tdcj.state.tx.us/death_row/dr_executed_offenders.html | Social Sciences |
Titanic Survival Data Set on Kaggle | https://www.kaggle.com/c/titanic/data | Social Sciences |
UCB's Archive of Social Science Data (D-Lab) | http://ucdata.berkeley.edu/ | Social Sciences |
UCLA Social Sciences Data Archive | http://dataarchives.ss.ucla.edu/Home.DataPortals.htm | Social Sciences |
UN Civil Society Database | http://esango.un.org/civilsociety/ | Social Sciences |
Universities Worldwide | http://univ.cc/ | Social Sciences |
Upjohn Institute for Employment Research | http://www.upjohn.org/services/resources/employment-research-data-center | Social Sciences |
Uppsala Conflict Data Program | http://ucdp.uu.se/ | Social Sciences |
World Bank Open Data | http://data.worldbank.org/ | Social Sciences |
WorldPop project - Worldwide human population distributions | http://www.worldpop.org.uk/data/get_data/ | Social Sciences |
FLOSSmole data about free, libre, and open source software development | http://flossdata.syr.edu/data/ | Software |
Betfair Historical Exchange Data | http://data.betfair.com/ | Sports |
Cricsheet Matches (cricket) | http://cricsheet.org/ | Sports |
Ergast Formula 1, from 1950 up to date (API) | http://ergast.com/mrd/db | Sports |
Football/Soccer resources (data and APIs) | http://www.jokecamp.com/blog/guide-to-football-and-soccer-data-and-apis/ | Sports |
Lahman's Baseball Database | http://www.seanlahman.com/baseball-archive/statistics/ | Sports |
Pinhooker: Thoroughbred Bloodstock Sale Data | https://github.com/phillc73/pinhooker | Sports |
Retrosheet Baseball Statistics | http://www.retrosheet.org/game.htm | Sports |
Tennis database of rankings, results, and stats for ATP, WTA, Grand Slams and Match Charting Project | https://github.com/JeffSackmann/tennis_atp | Sports |
Databanks International Cross National Time Series Data Archive | http://www.cntsdata.com/ | Time Series |
Hard Drive Failure Rates | https://www.backblaze.com/hard-drive-test-data.html | Time Series |
Heart Rate Time Series from MIT | http://ecg.mit.edu/time-series/ | Time Series |
Time Series Data Library (TSDL) from MU | https://datamarket.com/data/list/?q=provider:tsdl | Time Series |
UC Riverside Time Series Dataset | http://www.cs.ucr.edu/~eamonn/time_series_data/ | Time Series |
Airlines OD Data 1987-2008 | http://stat-computing.org/dataexpo/2009/the-data.html | Transportation |
Bay Area Bike Share Data | http://www.bayareabikeshare.com/open-data | Transportation |
Bike Share Systems (BSS) collection | https://github.com/BetaNYC/Bike-Share-Data-Best-Practices/wiki/Bike-Share-Data-Systems | Transportation |
GeoLife GPS Trajectory from Microsoft Research | http://research.microsoft.com/en-us/downloads/b16d359d-d164-469e-9fd4-daa38f2b2e13/ | Transportation |
German train system by Deutsche Bahn | http://data.deutschebahn.com/datasets/ | Transportation |
Hubway Million Rides in MA | http://hubwaydatachallenge.org/trip-history-data/ | Transportation |
Marine Traffic - ship tracks, port calls and more | http://www.marinetraffic.com/de/ais-api-services | Transportation |
Montreal BIXI Bike Share | https://montreal.bixi.com/en/open-data | Transportation |
NYC Taxi Trip Data 2009- | http://www.nyc.gov/html/tlc/html/about/trip_record_data.shtml | Transportation |
NYC Taxi Trip Data 2013 (FOIA/FOILed) | https://archive.org/details/nycTaxiTripData2013 | Transportation |
NYC Uber trip data April 2014 to September 2014 | https://github.com/fivethirtyeight/uber-tlc-foil-response | Transportation |
Open Traffic collection | https://github.com/graphhopper/open-traffic-collection | Transportation |
OpenFlights - airport, airline and route data | http://openflights.org/data.html | Transportation |
Philadelphia Bike Share Stations (JSON) | https://www.rideindego.com/stations/json/ | Transportation |
Plane Crash Database, since 1920 | http://www.planecrashinfo.com/database.htm | Transportation |
RITA Airline On-Time Performance data | http://www.transtats.bts.gov/Tables.asp?DB_ID=120 | Transportation |
RITA/BTS transport data collection (TranStat) | http://www.transtats.bts.gov/DataIndex.asp | Transportation |
Toronto Bike Share Stations (XML file) | http://www.bikesharetoronto.com/data/stations/bikeStations.xml | Transportation |
Transport for London (TFL) | https://tfl.gov.uk/info-for/open-data-users/our-open-data | Transportation |
Travel Tracker Survey (TTS) for Chicago | http://www.cmap.illinois.gov/data/transportation/travel-tracker-survey | Transportation |
U.S. Bureau of Transportation Statistics (BTS) | http://www.rita.dot.gov/bts/ | Transportation |
U.S. Domestic Flights 1990 to 2009 | http://academictorrents.com/details/a2ccf94bbb4af222bf8e69dad60a68a29f310d9a | Transportation |
U.S. Freight Analysis Framework since 2007 | http://ops.fhwa.dot.gov/freight/freight_analysis/faf/index.htm | Transportation |
Data Packaged Core Datasets | https://github.com/datasets/ | Complementary Collections |
Database of Scientific Code Contributions | https://mozillascience.org/collaborate | Complementary Collections |
A growing collection of public datasets: CoolDatasets. | http://cooldatasets.com/ | Complementary Collections |
DataWrangling: Some Datasets Available on the Web | http://www.datawrangling.com/some-datasets-available-on-the-web | Complementary Collections |
Inside-r: Finding Data on the Internet | http://www.inside-r.org/howto/finding-data-internet | Complementary Collections |
OpenDataMonitor: An overview of available open data resources in Europe | http://opendatamonitor.eu/ | Complementary Collections |
Quora: Where can I find large datasets open to the public? | http://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public | Complementary Collections |
RS.io: 100+ Interesting Data Sets for Statistics | http://rs.io/100-interesting-data-sets-for-statistics/ | Complementary Collections |
StaTrek: Leveraging open data to understand urban lives | http://xiaming.me/posts/2014/10/23/leveraging-open-data-to-understand-urban-lives/ | Complementary Collections |
.
Data Science Tools Database
Database: Data Science Tools
Know something related to data science that you recommend? Please add it so that others can benefit from it! You can add to the database here. Filter by primary and secondary category:
Name | Link | Description | Tool Category | Secondary Category |
---|---|---|---|---|
ChartBlocks | https://www.chartblocks.com/ | Build charts online with the easy-to-use ChartBlocks chart designer. Upload your data, then set to work designing your chart. | Visualisation | |
Chart.js | http://www.chartjs.org/ | Simple, clean and engaging HTML5-based JavaScript charts. Chart.js is an easy way to include animated, interactive graphs on your website for free. | Visualisation | |
Caffe | https://caffe.berkeleyvision.org/ | Caffe is a deep learning framework made with expression, speed, and modularity in mind. It is developed by Berkeley AI Research (BAIR) and by community contributors. Yangqing Jia created the project during his PhD at UC Berkeley. Caffe is released under the BSD 2-Clause license. | Machine Learning | Framework |
Carto | https://carto.com/ | CARTO is the platform for turning location data into business outcomes. | Visualisation | |
D3js | https://d3js.org/ | D3.js (Data Driven Documents) is a JavaScript library for manipulating documents based on data. D3 helps you bring data to life using HTML, SVG, and CSS. D3’s emphasis on web standards gives you the full capabilities of modern browsers without tying yourself to a proprietary framework, combining powerful visualization components and a data-driven approach to DOM manipulation. | Visualisation | |
Data Science Experience | https://datascience.ibm.com/ | IBM's collaborative cloud environment for data scientists, built around tools such as Jupyter notebooks and RStudio. | Collaboration | |
Datawrapper | https://www.datawrapper.de/ | Datawrapper is an open source tool helping everyone to create simple, correct and embeddable charts in minutes. Paste or upload a CSV file to create a chart or map. | Visualisation | |
Dygraphs | http://dygraphs.com/ | Dygraphs is a fast, flexible open source JavaScript charting library. It allows users to explore and interpret dense data sets. | Visualisation | |
Fusion Charts | https://www.fusioncharts.com/ | JavaScript charts for web and mobile apps. 90+ chart types. Fast, responsive and highly customizable. Supports all browsers. Even IE6! | Visualisation | |
Gephi | https://gephi.org/ | Gephi is the leading visualization and exploration software for all kinds of graphs and networks. Gephi is open-source and free. | Visualisation | |
GitHub | https://github.com/ | GitHub is a development platform inspired by the way you work. From open source to business, you can host and review code, manage projects, and build software alongside millions of other developers. | Repository | |
Google Charts | https://developers.google.com/chart/ | Interactive charts for browsers and mobile devices. | Visualisation | |
Google Data Studio | https://datastudio.google.com/ | Google's free data visualisation platform. | Visualisation | |
Google Tables | https://research.google.com/tables | Search the web for table data | Data Source | |
Google Fusion Tables | https://sites.google.com/site/fusiontablestalks/stories | Fusion Tables is an experimental data visualization web application to gather, visualize, and share data tables. | Database | |
Google Maps | https://www.google.com/maps/about/mymaps/ | Make maps. Easily create custom maps with the places that matter to you. | Visualisation | |
Gretl | http://gretl.sourceforge.net/ | gretl is an open-source statistical package, mainly for econometrics. The name is an acronym for Gnu Regression, Econometrics and Time-series Library. | Statistics | |
Hadoop (Apache) | http://hadoop.apache.org/ | The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, the library itself is designed to detect and handle failures at the application layer, so delivering a highly-available service on top of a cluster of computers, each of which may be prone to failures. | Big Data | |
Highcharts Cloud | https://cloud.highcharts.com/ | Highcharts' cloud-based tool for quick chart creation. | Visualisation | |
Hortonworks Data Platform (HDP) | https://hortonworks.com/products/data-platforms/hdp/ | The Hortonworks Data Platform, powered by Apache Hadoop, is a massively scalable and 100% open source platform for storing, processing and analyzing large volumes of data. It is designed to deal with data from many sources and formats in a very quick, easy and cost-effective manner. | Tool | |
IBM Cloud (Previously Bluemix) | https://www.ibm.com/cloud/get-started | A cloud platform as a service (PaaS) developed by IBM. It offers a range of services and integrated DevOps to build, run, deploy and manage applications on the cloud. Bluemix is based on Cloud Foundry open technology and runs on SoftLayer infrastructure. It supports several programming languages, including Java, Node.js, Go, PHP, Swift, Python, Ruby Sinatra and Ruby on Rails, and can be extended to support other languages such as Scala through the use of buildpacks. | Database | |
IBM Watson APIs | https://www.ibm.com/watson/products-services/ | Build applications with IBM's Watson APIs. | Tool | AI |
JSON | http://www.json.org/ | JSON (JavaScript Object Notation) is a lightweight data-interchange format. It is easy for humans to read and write. It is easy for machines to parse and generate. | Language | |
Jupyter Notebooks | http://jupyter.org/ | The Jupyter Notebook is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. Uses include: data cleaning and transformation, numerical simulation, statistical modeling, data visualization, machine learning, and much more. | Collaboration | |
Keras | https://keras.io/ | Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. | Machine Learning | Library |
Kubernetes | https://kubernetes.io/ | Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized applications. | Tool | Containerisation |
Leaflet | http://leafletjs.com/ | Leaflet is the leading open-source JavaScript library for mobile-friendly interactive maps. | Visualisation | |
OpenRefine | http://openrefine.org/ | A free, open source, power tool for working with messy data. | Data Cleaning | |
Plotly | https://plot.ly/ | Visualize Data, Together. Plotly lets users easily create interactive charts and dashboards to share online with their audience. | Visualisation | |
Python | https://www.python.org/ | A general-purpose programming language that is widely used for data science. | Language | |
Quid | https://quid.com/ | Quid is a platform that searches, analyzes and visualizes the world's collective intelligence to help answer strategic questions. Beautiful network visualisations. | Visualisation | |
R | https://www.r-project.org/ | R is a free software environment for statistical computing and graphics. | Language | |
R Shiny | https://shiny.rstudio.com/ | Shiny is an R package that makes it easy to build interactive web apps straight from R. You can host standalone apps on a webpage or embed them in R Markdown documents or build dashboards. You can also extend your Shiny apps with CSS themes, htmlwidgets, and JavaScript actions. | Tool | Visualisation |
Raw Graphs | http://rawgraphs.io/ | "The missing link between spreadsheets and data visualisation" | Visualisation | |
Sigma js | http://sigmajs.org/ | Sigma is a JavaScript library dedicated to graph drawing. It makes it easy to publish networks on Web pages, and allows developers to integrate network exploration in rich Web applications. | Visualisation | |
Spark (Apache) | https://spark.apache.org/ | Lightning-fast cluster computing for big data. A fast and general engine for large-scale data processing. | Big Data | |
Tableau | https://www.tableau.com/ | Interactive data visualisation and business intelligence software. | Visualisation | |
TensorFlow | https://www.tensorflow.org/ | An open-source software library for Machine Intelligence. | Machine Learning | |
Timeline js | https://timeline.knightlab.com/ | Timeline JS is a free, easy-to-use tool for telling stories in a timeline format. | Visualisation | |
Watson Visual Recognition | https://visual-recognition-demo.ng.bluemix.net/ | Watson API for image recognition. | AI | |
Sphinx | http://sphinx-doc.org/ | Sphinx is a documentation generator written and used by the Python community. It is written in Python, and also used in other environments. | Documentation | |