Data federation - room 101

Without data there is no story for HubScope to tell. The richer the data, through federation, the better the story.

 
data_shared.png

Database - typically used for small to medium amounts of structured data, using a schema-on-write approach (schema first, then data)
Popular databases are: MSSQL, MySQL, Oracle, Db2, PostgreSQL (relational), and MongoDB, Redis, Elasticsearch, Apache Cassandra (NoSQL, with looser schemas)

Data warehouse - typically a larger repository that aggregates multiple data sources, also using a schema-on-write approach (schema first, then data)
Popular data warehouses are: Snowflake, Yellowbrick, Teradata

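Data lake - used for big data sources where the data is stored in its native format, providing flexibility through schema-on-read access (data first, then schema)
Popular data lakes are: Hadoop. neo4j is often deployed alongside a data lake, but it is a graph database rather than a data lake itself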
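
The difference between schema-on-write and schema-on-read can be sketched in a few lines of Python. This is only an illustration, assuming an in-memory SQLite table as the "schema then data" side and a list of raw JSON lines as the "data then schema" side. The table name reading, the sample sensor records and the helper read_with_schema are made up for the example.

```python
import json
import sqlite3

# Schema-on-write: the table definition exists before any data arrives,
# and rows that violate it are rejected at insert time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reading (sensor TEXT NOT NULL, value REAL NOT NULL)")
conn.execute("INSERT INTO reading VALUES (?, ?)", ("s1", 21.5))

try:
    conn.execute("INSERT INTO reading VALUES (?, ?)", ("s2", None))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the NOT NULL constraint refused the row

stored_rows = conn.execute("SELECT sensor, value FROM reading").fetchall()

# Schema-on-read: raw records are kept exactly as received (hypothetical data);
# a structure is only applied when the data is queried.
raw_lines = [
    '{"sensor": "s1", "value": 21.5}',
    '{"sensor": "s2"}',
    '{"sensor": "s3", "value": "19", "unit": "C"}',
]


def read_with_schema(lines):
    """Apply a (sensor, value) schema at read time, skipping rows that do not fit."""
    rows = []
    for line in lines:
        record = json.loads(line)
        if "sensor" in record and "value" in record:
            rows.append((str(record["sensor"]), float(record["value"])))
    return rows


lake_rows = read_with_schema(raw_lines)
```

In the first half the incomplete row never gets stored. In the second half every raw record is kept, and the reader decides which ones fit the shape it needs.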

Data mart - typically focuses on one specialized subject or business unit; single-purpose, fast and efficient. Can be built from any combination of the above
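
One way to picture a data mart is as a narrow, pre-aggregated slice of a broader store. The sketch below assumes an in-memory SQLite table standing in for the wider repository and a view standing in for the mart. The table sales, the view marketing_mart and the figures are invented for illustration, and a real data mart may well be a separate physical store rather than a view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Broad store covering every department (hypothetical sample data).
conn.execute("CREATE TABLE sales (region TEXT, department TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("EMEA", "marketing", 120.0),
        ("EMEA", "finance", 80.0),
        ("APAC", "marketing", 200.0),
    ],
)

# The "mart": one business unit only, already aggregated for fast reads.
conn.execute(
    "CREATE VIEW marketing_mart AS"
    " SELECT region, SUM(amount) AS total FROM sales"
    " WHERE department = 'marketing' GROUP BY region"
)

mart_rows = conn.execute(
    "SELECT region, total FROM marketing_mart ORDER BY region"
).fetchall()
```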


graph_claw.png

CSV - flat file with no structure beyond a fixed comma-separated format and an optional header row. Common as a legacy input/output format

Excel - popular and widely used proprietary format with a fixed structure of workbooks, tabs (worksheets), rows and columns. Easy-to-use interface

JSON - self-describing and both human and machine readable, no schema required

XML - self-describing, more machine readable than human readable, often has schema constraints to reduce errors
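
To compare the text formats side by side, the sketch below writes one record as CSV, JSON and XML using only the Python standard library, then reads each back. The record and the person element name are invented for the example. Excel is left out because the standard library has no reader or writer for it.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

record = {"id": "1", "name": "Ada", "city": "London"}  # hypothetical sample record

# CSV: structure comes only from column order and the optional header row.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(record), lineterminator="\n")
writer.writeheader()
writer.writerow(record)
csv_text = buffer.getvalue()

# JSON: every value carries its own key, so the record describes itself.
json_text = json.dumps(record)

# XML: also self-describing, with a tag wrapped around every value.
root = ET.Element("person")
for key, value in record.items():
    ET.SubElement(root, key).text = value
xml_text = ET.tostring(root, encoding="unicode")

# Reading each format back recovers the same record.
from_csv = next(csv.DictReader(io.StringIO(csv_text)))
from_json = json.loads(json_text)
from_xml = {child.tag: child.text for child in ET.fromstring(xml_text)}
```

Without its header row the CSV text would be just 1,Ada,London, with nothing to say what each column means. The JSON and XML versions keep the field names next to the values.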

SQL - data is stored against a defined structure (tables, keys and constraints) to maintain referential integrity. Specialized DBA skills are often required for complex SQL queries
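
Referential integrity can be shown with a small sketch using Python's built-in sqlite3 module. The customer and purchase tables and their rows are assumptions made for the example. SQLite only enforces foreign keys once the foreign_keys pragma is switched on, and other database engines behave differently on this point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# Hypothetical two-table structure: every purchase must point at a real customer.
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE purchase ("
    " id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(id),"
    " amount REAL NOT NULL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO purchase VALUES (10, 1, 25.0)")

# A purchase for a customer that does not exist is refused.
try:
    conn.execute("INSERT INTO purchase VALUES (11, 99, 5.0)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Because the link is guaranteed, a join can rely on it.
totals = conn.execute(
    "SELECT c.name, SUM(p.amount) FROM customer c"
    " JOIN purchase p ON p.customer_id = c.id GROUP BY c.name"
).fetchall()
```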

Neo4j - a graph database, one kind of NoSQL database, where data is stored as nodes and edges (relationships). Queried with Cypher, a query language with an intuitive, pattern-based syntax
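
The nodes-and-edges model can be sketched in plain Python without a running graph database. The people, company and relationship names below are invented, and the helper related is hypothetical. The Cypher shown in the comment is roughly the query this lookup stands in for, assuming matching labels and properties existed in a real graph.

```python
# Nodes carry a label and properties; edges connect two nodes and have a type.
nodes = {
    1: {"label": "Person", "name": "Ada"},
    2: {"label": "Company", "name": "Acme"},
    3: {"label": "Person", "name": "Grace"},
}
edges = [
    (1, "WORKS_AT", 2),
    (3, "WORKS_AT", 2),
    (1, "KNOWS", 3),
]


def related(start_name, relationship):
    """Names of nodes reached from start_name along edges of the given type."""
    start_ids = {nid for nid, props in nodes.items() if props["name"] == start_name}
    return [
        nodes[target]["name"]
        for source, rel, target in edges
        if source in start_ids and rel == relationship
    ]


# Rough Cypher equivalent of the first lookup:
#   MATCH (p:Person {name: 'Ada'})-[:WORKS_AT]->(c:Company) RETURN c.name
employer = related("Ada", "WORKS_AT")
friends = related("Ada", "KNOWS")
```

The Cypher pattern reads much like the picture of the graph, a node, an arrow, another node, which is where the claim of intuitive syntax comes from.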