Data federation - room 101
Without data there is no story for HubScope to tell; the richer the data, through federation, the better the story
Database - typically used for small to medium amounts of structured data, taking a Schema-on-write approach (Schema then data)
Popular databases are: MSSQL, MySQL, Oracle, Db2, PostgreSQL, MongoDB, Redis, Elasticsearch, Apache Cassandra
Data warehouse - typically a larger repository that aggregates multiple data sources, also taking a Schema-on-write approach (Schema then data)
Popular data warehouses are: Snowflake, Yellowbrick, Teradata
Data lake - used for big data sources where the data is stored in its native format, providing flexibility through Schema-on-read access (Data then schema)
Popular data lake platforms are: Hadoop (HDFS), Amazon S3, Azure Data Lake Storage
Data mart - typically focuses on one specialized subject matter or business unit, single use, fast and efficient. Achieved through any combination of the above
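The difference between Schema-on-write and Schema-on-read can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite table stands in for a database or warehouse, a list of raw JSON strings stands in for a data lake, and the names readings, raw_lake, try_insert and read_with_schema are hypothetical.

```python
import json
import sqlite3

# Schema-on-write: the structure is declared before any data arrives,
# and rows that do not fit it are rejected at write time.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT NOT NULL, value REAL NOT NULL)")
conn.execute("INSERT INTO readings VALUES (?, ?)", ("s1", 21.5))


def try_insert(sensor, value):
    """Return True if the row satisfies the schema, False if it is rejected."""
    try:
        conn.execute("INSERT INTO readings VALUES (?, ?)", (sensor, value))
        return True
    except sqlite3.IntegrityError:
        return False


# A missing value violates the NOT NULL constraint, so the write fails.
rejected = not try_insert("s2", None)

# Schema-on-read: records are stored exactly as they arrived (hypothetical data),
# even when their fields differ from one record to the next.
raw_lake = [
    '{"sensor": "s1", "value": 21.5}',
    '{"sensor": "s2", "temp_c": "19.0", "site": "roof"}',
]


def read_with_schema(raw_records):
    """Apply a schema only when reading: pick the fields needed and coerce types."""
    rows = []
    for line in raw_records:
        record = json.loads(line)
        value = record.get("value", record.get("temp_c"))
        rows.append((record["sensor"], float(value)))
    return rows


lake_rows = read_with_schema(raw_lake)
stored_count = conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

In this sketch the database refuses the incomplete row up front, while the lake accepts everything and leaves interpretation to whoever reads it later.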
CSV - flat file with no structure beyond a fixed comma-separated format, optional header row. Common as a legacy input/output format
Excel - Popular and widely used, proprietary format using a fixed structure of workbooks, tabs, rows and columns. Easy user interface
JSON - self-describing, both human-readable and machine-readable, no schema requirements
XML - self-describing, more machine-readable than human-readable, often has schema constraints to reduce errors
SQL - data is stored against a defined structure to maintain referential integrity. Specialized DBA skills are required for complex SQL queries
Neo4j - graph database, a type of NoSQL database, where data is stored as nodes and edges. Its Cypher query language uses an intuitive syntax
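To make the formats above concrete, here is a minimal sketch that expresses one made-up record (a company called Acme) in each of them, using only the Python standard library. The record, table names and variable names are assumptions for illustration. SQLite stands in for a SQL database, and the graph part is a plain-Python stand-in for nodes and edges, not the Neo4j API.

```python
import csv
import io
import json
import sqlite3
import xml.etree.ElementTree as ET

# The same hypothetical record expressed in three file formats.
csv_text = "name,employees\nAcme,120\n"
json_text = '{"name": "Acme", "employees": 120}'
xml_text = "<company><name>Acme</name><employees>120</employees></company>"

# CSV: the header row supplies field names; every value is read as a string.
csv_row = next(csv.DictReader(io.StringIO(csv_text)))

# JSON: keys travel with the data, and numbers keep their type.
json_row = json.loads(json_text)

# XML: tags describe the data; element text stays a string until converted.
root = ET.fromstring(xml_text)
xml_row = {child.tag: child.text for child in root}

# SQL: a foreign key enforces referential integrity between two tables.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE office (id INTEGER PRIMARY KEY, city TEXT NOT NULL, "
    "company_id INTEGER NOT NULL REFERENCES company(id))"
)
conn.execute("INSERT INTO company (id, name) VALUES (1, 'Acme')")
conn.execute("INSERT INTO office (city, company_id) VALUES ('Leeds', 1)")

try:
    # Company 99 does not exist, so this office would be an orphan row.
    conn.execute("INSERT INTO office (city, company_id) VALUES ('Paris', 99)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Graph: nodes plus typed edges, modelled here with plain Python containers.
nodes = {"acme": {"label": "Company"}, "leeds": {"label": "City"}}
edges = [("acme", "HAS_OFFICE_IN", "leeds")]
office_cities = [
    dst for src, rel, dst in edges if src == "acme" and rel == "HAS_OFFICE_IN"
]
```

The CSV and XML readers hand back the employee count as text, JSON keeps it as a number, and the SQL table refuses an office that points at a company that does not exist.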