Questions tagged [data-warehouse]

A database system optimised for reporting, particularly in aggregate. Often, but not always, implemented using a star schema.

A Data Warehouse is a specialised database system that is optimised for reporting, or at least for easy extraction of data. The data in a data warehouse is usually loaded from external systems, and the database design differs significantly from that used for transactional systems. Data warehouse systems often have several characteristic features:

Use of star schemas with central fact tables joining to dimension tables. These facilitate fast aggregate reporting and simple query plans. Sometimes other designs such as snowflake schemas or relatively normalised operational data store databases are employed.
Storage of historical data - often data warehouse systems are used for analytical queries that examine historical data, or trends in aggregate figures over time.
ETL processes to load the data from external sources. Data warehouse systems often function to aggregate data from multiple sources.
Conformed data - data from multiple sources is often transformed into a common format, allowing data from multiple sources to be queried at the same time.
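
As a minimal sketch of the star schema feature described above, the following Python example builds a tiny in-memory SQLite database with one fact table joined to two dimension tables, then runs an aggregate report over it. The table names (`fact_sales`, `dim_date`, `dim_product`), the columns, and the sample rows are all hypothetical and chosen only for illustration; a real warehouse would load its data through an ETL process from external sources.

```python
import sqlite3


def build_star_schema(conn):
    """Create a hypothetical star schema: one fact table, two dimensions."""
    conn.executescript("""
        CREATE TABLE dim_date (
            date_key INTEGER PRIMARY KEY,
            year     INTEGER,
            month    INTEGER
        );
        CREATE TABLE dim_product (
            product_key INTEGER PRIMARY KEY,
            category    TEXT
        );
        -- The fact table holds measures plus a key into each dimension.
        CREATE TABLE fact_sales (
            date_key    INTEGER REFERENCES dim_date(date_key),
            product_key INTEGER REFERENCES dim_product(product_key),
            amount      REAL
        );
    """)
    # Illustrative sample rows only.
    conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)",
                     [(1, 2023, 12), (2, 2024, 1)])
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(10, "books"), (20, "games")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?)",
                     [(1, 10, 5.0), (1, 20, 7.5), (2, 10, 2.5), (2, 10, 4.0)])


def sales_by_year_and_category(conn):
    """Aggregate the fact table, grouped by attributes of both dimensions."""
    return conn.execute("""
        SELECT d.year, p.category, SUM(f.amount) AS total
        FROM fact_sales AS f
        JOIN dim_date    AS d ON d.date_key    = f.date_key
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY d.year, p.category
        ORDER BY d.year, p.category
    """).fetchall()


conn = sqlite3.connect(":memory:")
build_star_schema(conn)
rows = sales_by_year_and_category(conn)
for year, category, total in rows:
    print(year, category, total)
```

Each join in the report goes directly from the fact table to a dimension table, which is what keeps the query plan simple in this kind of design.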

409 questions

5 answers

What are some ways to implement a many-to-many relationship in a data warehouse?

The dominant topologies of Data Warehouse modelling (Star, Snowflake) are designed with one-to-many relationships in mind. Query readability, performance, and structure degrade severely when faced with a many-to-many relationship in these modelling…

database-design data-warehouse

asked Jan 04 '11 at 06:25

Brian Ballsun-Stanton

3 answers

What are the arguments in favor of using ELT process over ETL?

I realized that my company uses an ELT (extract-load-transform) process instead of using an ETL (extract-transform-load) process. What are the differences in the two approaches and in which situations would one be "better" than the other? It would…

data-warehouse etl business-intelligence

asked Jun 14 '12 at 08:24

What'sUP

4 answers

Compare two similar Postgres databases for differences

I occasionally download publicly available data sets in the form of Postgres DBs. These datasets are updated/modified/expanded over time by the repository host. Is there a Postgres command or tool (ideally FOSS) that can show the differences…

postgresql data-warehouse

asked Jun 05 '14 at 14:17

CuriousGorge

3 answers

Clustered columnstore indexes and foreign keys

I am performance tuning a data warehouse using indexes. I am fairly new to SQL Server 2014. Microsoft describes the following: "We view the clustered columnstore index as the standard for storing large data warehousing fact tables, and expect it…

sql-server foreign-key data-warehouse sql-server-2014 columnstore

asked Oct 07 '14 at 08:39

OverflowStack

2 answers

Difference between star schema and data cube?

I am involved in a new project where I have to create a data cube from the existing relational database system. I understand that the existing system is not properly designed, and I am not sure where to start. My questions are: What is the difference…

database-design data-warehouse

asked May 05 '17 at 09:52

Rathish Kumar B

2 answers

Open Source Business Intelligence/DWH solutions

I'm surprised that this question hasn't already been asked. Google gives me only a few results, and they don't show a high-quality tool. What are some open source (free is also OK) solutions for Data Warehouses and, more specifically, Business Intelligence…

tools data-warehouse database-agnostic business-intelligence

asked Mar 02 '11 at 17:23

DrColossos

1 answer

Query strategies using SQL Server 2016 system-versioned temporal tables for Slowly-Changing Dimensions

When using a system-versioned temporal table (new in SQL Server 2016), what are the query authoring and performance implications when this feature is used to handle Slowly Changing Dimensions in a large relational data warehouse? For example, assume…

sql-server data-warehouse slowly-changing-dimension sql-server-2016 temporal-tables

asked Jul 10 '15 at 23:05

Justin Grant

2 answers

Handling time zones in data mart/warehouse

We are starting to design the building blocks of a data mart/warehouse and we need to be able to support all time zones (our clients are from all over the world). From reading discussions online (and in books), a common solution seems to be to have…

sql-server-2012 data-warehouse datetime timezone

asked Oct 18 '13 at 12:39

Vesselin Obreshkov

2 answers

Alternative to EAV for dynamic fields in a star schema data warehouse

I need to support dynamic fields and values in a big data warehouse that stores an API request log. My use case is that I need to store the query string of every API request and be able to query against them in the future (so it is not just storage, so I…

database-design data-warehouse eav star-schema redshift

asked May 07 '14 at 18:21

Howard

2 answers

ETL: extracting from 200 tables - SSIS data flow or custom T-SQL?

Based on my analysis, a complete dimensional model for our data warehouse will require extraction from over 200 source tables. Some of these tables will be extracted as part of an incremental load and others will be a full load. To note, we have…

sql-server sql-server-2005 ssis data-warehouse etl

asked Dec 14 '12 at 21:39

8kb

1 answer

Should I disable "auto update statistics" in a data warehousing scenario?

I have a 200 GB data warehouse in SQL Server. I have been experiencing really slow execution times for some queries; for example, 12 hours for a simple delete query with an inner join. After doing some research with the execution plans, I've updated…

sql-server data-warehouse statistics

asked Dec 11 '13 at 13:27

saso

1 answer

Limit the degree of parallelism (DOP) available to any query

On Oracle Exadata (11gR2), we have a relatively beefy database: cpu_count is 24, parallel_server_instances is 2, and parallel_threads_per_cpu is 2. We noted, through observation in Oracle Enterprise Manager (OEM), that performance was terrible due to…

oracle data-warehouse

asked Nov 29 '11 at 15:23

grenade

3 answers

Is SQL Server data compression categorically good for read-only databases?

Some literature on SQL Server data compression that I have read states that the write cost increases to about four times what would normally be required. It also seems to imply that this is the primary downside to data compression, strongly implying that for…

sql-server sql-server-2012 data-warehouse compression

asked Apr 03 '13 at 21:48

孔夫子

1 answer

How much RAM should I get for a cloud-hosted PostgreSQL data warehouse?

I'm looking at migrating a current PostgreSQL data warehouse to a cloud host with SSD storage and RAM as one of the main sizing variables. The bulkiest data we're dealing with at the moment will live on monthly partitioned tables. Each month is…

postgresql memory data-warehouse

asked Aug 15 '16 at 16:10

raphael

2 answers

PostgreSQL for high volume transactions and for Data warehousing

I am quite new to PostgreSQL and have never done a large deployment using it before. But I have good experience in enterprise solutions, and I want to try to apply some of what I learned using PostgreSQL. I have a site which is sized to handle large…

postgresql data-warehouse etl

asked Jan 25 '12 at 20:07

Mo J. Mughrabi
