SQL Server indexes and statistics

What is the difference between CREATE INDEX and CREATE STATISTICS, and when should I use each?

sql-server index statistics

— Scott

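Indexes store actual data (data pages or index pages, depending on the type of index), whereas statistics store the distribution of the data. Accordingly, CREATE INDEX is the DDL that creates an index (clustered, nonclustered, etc.), and CREATE STATISTICS is the DDL that creates statistics on columns in a table.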

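I recommend reading up on these aspects of relational data. Below are a few introductory articles for beginners. These are very broad topics, so the material on them can go very wide and very deep. Read the references below for the general ideas, and ask more specific questions as they come up.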

Books Online (BOL) reference on table and index organization
BOL reference on clustered index structure
BOL reference on nonclustered index structure
SQL Server Central article introducing indexes
BOL reference on statistics

Here is a practical example that shows these two pieces in action (with comments for explanation):

use testdb;
go

create table MyTable1
(
    id int identity(1, 1) not null,
    my_int_col int not null
);
go

insert into MyTable1(my_int_col)
values(1);
go 100

-- this statement will create a clustered index
-- on MyTable1.  The index key is the id field
-- but due to the nature of a clustered index
-- it will contain all of the table data
create clustered index MyTable1_CI
on MyTable1(id);
go


-- by default, SQL Server creates a statistics object
-- for this index.  Here is proof: a stat exists with the
-- same name as the index, and its stat column is the
-- index key column
select
    s.name as stats_name,
    c.name as column_name
from sys.stats s
inner join sys.stats_columns sc
on s.object_id = sc.object_id
and s.stats_id = sc.stats_id
inner join sys.columns c
on sc.object_id = c.object_id
and sc.column_id = c.column_id
where s.object_id = object_id('MyTable1');


-- here is a standalone statistics object on a single column
create statistics MyTable1_MyIntCol
on MyTable1(my_int_col);
go

-- now look at the statistics that exist on the table.
-- there is an additional statistics object that does not
-- correspond to any index
select
    s.name as stats_name,
    c.name as column_name
from sys.stats s
inner join sys.stats_columns sc
on s.object_id = sc.object_id
and s.stats_id = sc.stats_id
inner join sys.columns c
on sc.object_id = c.object_id
and sc.column_id = c.column_id
where s.object_id = object_id('MyTable1');


-- what does a stat look like?  run DBCC SHOW_STATISTICS
-- to get a better idea of what is stored
dbcc show_statistics('MyTable1', 'MyTable1_CI');
go

A test sample of the statistics looks like this:

[Image: sample DBCC SHOW_STATISTICS output]

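Notice that statistics hold the data distribution. They help SQL Server determine an optimal plan. A good analogy: imagine you are about to lift a heavy object. If you knew its weight because it was marked on the object, you could decide the best way to lift it and which muscles to use. That is roughly what SQL Server does with statistics.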
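To look at that distribution information more directly, something like the following could be run against the example table. This is a minimal sketch that assumes the MyTable1 objects created earlier still exist: WITH HISTOGRAM limits the DBCC output to the histogram steps, and STATS_DATE reports when each statistics object was last updated.

```sql
-- show only the histogram portion of the standalone statistics
dbcc show_statistics('MyTable1', 'MyTable1_MyIntCol') with histogram;
go

-- list every statistics object on the table with its last update time
select
    s.name as stats_name,
    stats_date(s.object_id, s.stats_id) as last_updated
from sys.stats s
where s.object_id = object_id('MyTable1');
go
```

Because every row in the example holds the same value in my_int_col, the histogram should be very small.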

-- create a nonclustered index
-- with the key column as my_int_col
create index IX_MyTable1_MyIntCol
on MyTable1(my_int_col);
go

-- let's look at this index
select
    object_name(object_id) as object_name,
    name as index_name,
    index_id,
    type_desc,
    is_unique,
    fill_factor
from sys.indexes
where name = 'IX_MyTable1_MyIntCol';

-- now let's see some physical aspects
-- of this particular index
-- (the third argument, 4, is the index_id returned by the
-- query above on my system; substitute the value you get)
select *
from sys.dm_db_index_physical_stats
(
    db_id('TestDB'),
    object_id('MyTable1'),
    4,
    null,
    'detailed'
);

From the example above, you can see that the index actually contains data (what the leaf pages hold depends on the type of index).
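The hard-coded index_id in the last query depends on what else has been created on the table, so it may not match on another system. One way to avoid copying it by hand is to look the value up into a variable first. This is a sketch that assumes the current database is the one holding MyTable1; @index_id is a name introduced here for illustration.

```sql
declare @index_id int;

-- look up the id of the nonclustered index by name
select @index_id = index_id
from sys.indexes
where object_id = object_id('MyTable1')
and name = 'IX_MyTable1_MyIntCol';

-- pass the looked-up id instead of a literal
select
    index_id,
    index_type_desc,
    index_level,
    page_count,
    record_count
from sys.dm_db_index_physical_stats
(
    db_id(),
    object_id('MyTable1'),
    @index_id,
    null,
    'detailed'
);
go
```

The result has one row per level of the index, so the leaf level and any upper levels can be compared by page_count and record_count.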

This post gives only a very brief overview of these two large aspects of SQL Server. Either one could fill chapters and books. Read some of the references, and you will come away with a better understanding.

— Thomas Stringer

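I know this is an old post, but it is worth noting that creating an index will (in most cases) automatically generate statistics for that index. The reverse does not hold: creating statistics does not create an index.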

— Steve Mangiameli
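The asymmetry described in that last comment can be checked against the example table. The query below is a sketch that assumes MyTable1 and its objects from the answer above still exist. It relies on the catalog convention that statistics belonging to an index have a stats_id equal to that index's index_id.

```sql
-- for each statistics object on the table, report whether an index backs it
select
    s.name as stats_name,
    s.auto_created,
    s.user_created,
    case when i.index_id is null then 'no' else 'yes' end as has_index
from sys.stats s
left join sys.indexes i
on s.object_id = i.object_id
and s.stats_id = i.index_id
where s.object_id = object_id('MyTable1');
go
```

On the example table, MyTable1_CI and IX_MyTable1_MyIntCol should report an index, while the standalone MyTable1_MyIntCol should not.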