Late start into 2014

Long time no post. There’s a number of reasons for that, all of them “overridden” by my laziness. 馃檪 I had a strong resolution of keeping good progress with the blog, but reality proved otherwise. Anyway, I am coming back to blogging since there are some ideas started or good to be shared.

SQL Server

First of all, SQL Server 2014 is publicly available since April 2014, and it’s hot. I’m not willing to copy/paste official marketing statements, but In-Memory OLTP (former Hekaton) and buffer pool extensions seem very promising, especially now, when the hardware prices are declining and there is more capacity available (in terms of both hard drive and memory). I tried it at CTP2 stage and I’m going to return聽to RTM for more testing action聽– I still haven’t experienced it live in production.
I also have some drafts of XML-related posts – how to process XML in SQL Server and mostly how to extract data to XML. This topic was relevant to me at some point and I had problems finding good comparison with execution statistics. Since I’m revisiting this, I’ll wrap it up in a series of posts.

Oracle

With a change in my job, I’m getting to know Oracle. I’m responsible for deploying an application that is using Database Engine and is hosted on WebLogic Server (both of them 11g). This is a completely new experience for me, so it’s possible some posts related to those areas, for example transition from SQL Server to Oracle.

ALM

Earlier on this year, I challenged myself and passed three exams that are required for MCSD: Application Lifecycle Management. I am starting to think in terms of ALM more and more. Last year I made some experiments with TFS 2013 while it was still in Preview stage, focusing on Continuous Integration and Continuous Delivery. Some time passed – TFS 2013 not only went RTM but already got Update 2 – and it might be reasonable to come back to that as Microsoft is incrementally adding new releases to each update. Another point worth noting here is that Microsoft is giving a TFS to everyone for free at visualstudio.com, of course with limited functionality but even the free plan gives you unlimited projects for up to 5 users, 60 minutes of build and 15000 virtual user load minutes, so a great place to start.

Others

Phase change cooling of Celeron 430 on DFI DK P35 馃檪 -50 Celsius on CPU done!


I came back actively to overclocking – I liked pushing an envelope a lot since my beginning with computers. I got myself a single stage phase change cooling unit (which is basically a stripped refrigerator – see the picture to the right) and a load of CPUs and RAM dating between 2002 and 2010. There’s going to be a lot of fun with that and for those who are interested you can follow me on hwbot.

And I decided to finally change the boring Andreas theme to something more modern. 馃檪

Building a clustered test lab with Windows Server 2012 RC and SQL Server 2012 RTM Evaluation

I have already described what my test lab at home looks like, however I find it interesting enough to provide details how to set it up. This post will be first in series and will cover planning and preparing the environment. Next posts will be about:

  • installing Windows Server 2012 in a VHD and聽enabling Hyper-V role
  • preparing virtual machine template and setting up virtual machines
  • preparing environment for setting up failover cluster聽
  • installing SQL Server on 2-node cluster

I am using concepts shown in a blog post by Jose Barreto describing this setup with Windows Server 2008 R2 and SQL Server 2008 R2.聽We will use Windows Server 2012 and SQL Server 2012 but main idea is the same.聽To build this lab you require:

  • a PC which:
    • has a CPU that supports virtualization in hardware (for Intel you want a CPU supporting VT-x technology, for AMD – AMD-V) – most聽non-low-end聽CPUs do
    • has minimum of 4 GB RAM (8 GB or more recommended)
    • has 50 GB free HDD space (up to you)
  • some software – all available for free:
    • Windows Server 2012 RC聽or Hyper-V Server 2012 RC聽installation media (I will use “full” Windows Server as I didn’t do it using command-line/PowerShell in Hyper-V Server) – I don’t recommend using VHD provided by Microsoft since it expands from 8 to 40 GB during boot time and you’d need lots more disk space ;
    • SQL Server 2012 Evaluation聽– valid from 180 days, but it’s more than enough for start (you might try Developer Edition if you have one – you don’t have to worry about expiration then)
  • whole procedure takes about 8 hours (depends on your system speed, mainly I/O, so if you’re blessed with RAID 0 on SSD drives you might actually do it much faster) – be warned!

What we’re going to do is we will install fresh Windows Server 2012 operating system in a VHD (so as not to mess with existing setup – you may install on a physical disk if you don’t mind, performance-wise it’s negligible) – I assume that a main OS on the PC we’re using is Windows 7/Server 2008 R2聽or newer, because XP does not support native-boot from VHD (not sure about Vista/Server 2008, but does anyone use it anymore? :P). Within this newly installed Server 2012 we will enable Hyper-V role and set up 4 virtual machines using one base template and four differencing disks. Then we will configure those machines like this:

  1. First machine is a domain controller and聽a DNS server – clustering requires聽a domain and Active Directory requires DNS.
  2. Second machine is storage controller – in the post by Jose (mentioned above) separate download of iSCSI Target is required for Windows Server 2008 R2, however it’s incorporated in Server 2012 – we don’t have to install anything, only enable additional OS features and configure them.
  3. Third machine is an active node of SQL Server failover cluster.
  4. Fourth machine is a passive node of SQL Server failover cluster.
  5. You may add more nodes to the cluster or consider setting up AlwaysOn.

This makes our lab completely standalone, so it can operate even if there is no connection to the outside. If it’s not required to have it this way, for example – you have a domain controller – you may skip creation of first VM. If you have a NAS/SAN, you may use it and skip the second machine. You may also consider combining first two machines into one (if you’re low on resources and don’t want to spawn a horde of machines). AD services in case of a domain so huge as this one will not be very demanding, definitely we’re not going to have a workload on storage impact a domian controller significantly. On the other hand you may also try setting up additional machines to see how it works – I made some with System Center 2012 RC, but you can also build a load-balanced SharePoint farm with a clustered database engine.

When it comes to network – no additional configuration will be done as clustering in Windows Server 2008 removed the need for dedicated heartbeat, so as long as our VMs see each other it will be fine (you may try to include DHCP server in this setup but remember to keep it聽contained so that it does not try to provide addresses in every network it may find). Just remember – live system may not have exactly the same architecture and whatever works for our homemade cluster may not be enough when it’s deployed into production (but hey, that’s a lab we’re building!).

When it comes to resource assignment for VMs I use the following pattern (I have Phenom II X4 CPU and 16 GB in my rig):

  1. For domain controller – 2 CPUs and 768 – 1024 MB RAM.
  2. For storage server – 2 CPUs and 768 MB RAM.
  3. For SQL Servers – 4 CPUs and 2048 – 3072 MB RAM.
  4. Rest – whatever I feel like/need.

When I’m setting this on a laptop where I have memory limitations (4 GB RAM) I cut down first two VMs to 512 MB RAM and SQL Servers to 1024 – 1280 MB. This allows me to remain operable, however sacrificing a lot of performance. Therefore 8 GB RAM is recommended.

And this concludes our preparation. Next post will get us through host setup – we will install Windows Server 2012 in a VHD and make it bootable so stay sharp!

Object whitespace-naming madness

Have you ever tried creating objects and giving them strange names? I made some experiment at one of lab servers, main focus was space (you know, the long key at the bottom of the keyboard :D).

So, without further ado – the first object to be tested is a database. Run:

create database [ ]

and you get your favourite Command(s) completed successfully response. A quick look into Object Explorer shows that a space-named database was indeed created. Let鈥檚 script it and see what a monster we just spawned:

USE [master]   
GO  /****** Object:  Database [ ]    Script Date: 02/15/2012 22:21:54 ******/   
CREATE DATABASE [ ] ON  PRIMARY     
( NAME = N'', FILENAME = N'C:\SqlData\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\ .mdf' , SIZE = 1280KB , MAXSIZE = UNLIMITED, FILEGROWTH = 1024KB )   
 LOG ON     
( NAME = N' _log', FILENAME = N'C:\SqlData\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\ _log.LDF' , SIZE = 504KB , MAXSIZE = 2048GB , FILEGROWTH = 10%)    
GO  ALTER DATABASE [ ] SET COMPATIBILITY_LEVEL = 100   
GO  IF (1 = FULLTEXTSERVICEPROPERTY('IsFullTextInstalled'))   

begin    
EXEC [ ].[dbo].[sp_fulltext_database] @action = 'enable'    
end    
...all the ALTERs go here...

The filenames on the HDD are .mdf (space.mdf) and _log.ldf (space_log.ldf), which are perfectly valid filenames as far as NTFS is concerned, but clearly database data file has a name of an empty string (weird). It seems that in this case some kind of trimming has taken place, but not in the case of database log file name which starts with a space.

We can go forward and try creating another space-named database. Try

create database [  ]  

Notice slight difference in both scripts 鈥 first database is called one space, while second is two spaces. The result is at least weird (first result from SQL Server 2005, second from 2008 R2):

Msg 1801, Level 16, State 3, Line 1  
Database 鈥  鈥 already exists. 

or

Msg 1801, Level 16, State 3, Line 1   
Database '  ' already exists. Choose a different database name.    

Funny, isn鈥檛 it? It seems as if SQL Server trims the database name of whitespaces before actually attempting to create such an object. I鈥檓 not wondering about added value that space-named databases might give, but certainly something is not right. If I could create a database with a name of 鈥 鈥, why not 鈥 鈥? Maybe other whitespaces will work as well?

In fact, they do. Let鈥檚 try with a bell character (ASCII code 7). First obstacle is that SSMS editor does not let you type in bell character, so you have to work our way around with standard T-SQL CHAR function. But then operating system will not let you create a file with bell inside of a name, so you have to change the file names. In the end, the code is like this:

declare @DBName sysname   
declare @createQuery nvarchar(2000)  select @DBName = CHAR(7)   
select @createQuery = '    
CREATE DATABASE [' + @DBName + '] ON  PRIMARY     
( NAME = N''1'', FILENAME = N''C:\SqlData\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\1.mdf'' , SIZE = 2048KB , FILEGROWTH = 1024KB )    
 LOG ON     
( NAME = N''1_log'', FILENAME = N''C:\SqlData\MSSQL10_50.MSSQLSERVER\MSSQL\DATA\1_log.ldf'' , SIZE = 1024KB , FILEGROWTH = 10%)    
'  execute sp_executesql @createQuery

I expected the bell character to act like one, so I opened SQLCMD and issued standard

SELECT * FROM sys.databases

But, to my surprise bell-character-named database is displayed as new line in the result set and my hopes for audible SQL are gone. 馃檪

Let’s go into our space-named database just to try to create a schema and a table inside with a column of type int. Try following piece of script:

use [ ]    
go    
create schema [ ]    
go    
create table [ ].[ ] ( [ ] int)    
go    

It really works, and the Object Explorer does prove it. You might actually test by issuing

INSERT INTO [ ].[ ].[ ].[ ] ([ ]) VALUES (1)

It would be even funnier if the column was of char type, then you could something like that (just don’t get mixed with delimiters and text escaping :P):

INSERT INTO [ ].[ ].[ ].[ ] ([ ]) VALUES (' ')

Wrapping it all up – since the post contained a lot of humour – I would rather expect whitespaces not to be permitted in object names at all, or at least that an object name has to contain one or more characters. For the sake of clarity I would even go further 鈥 to prevent crazy people from going rampant let object names be like variable names in most programming languages 鈥 contain only letters, digits and underscore, but cannot start with a digit.
On the other hand, you might say it’s just how flexible SQL Server is.

SQL Server Hardware Choices Made Easy by Glenn Berry – a quick review

I have come across Glenn Berry’s e-book by accident on twitter while ago by accident, downloaded it and then liked it very much. I have read it over one evening and first thing I did next day after arriving to work was forwarding it to all my coworkers.

What was the most interesting for me (as I share Glenn’s interest in hardware) was the detailed description of several interesting configurations using current hardware, as well as some future trends (like enterprise SSD drives, AMD Bulldozer or new Intel CPU). It just picks up where most of the books and guides leave you – for instance when most authors tell you to set up RAID-1 for logs and RAID-10 for database files, Glenn gives you an overview what actual models and what exact configuration may be applied and what is the cost of it. Furthermore, he shows what CPUs and in which configuration are preferred depending on your worklad (OLTP differs from OLAP). You will see that RAM is cheap and you don’t have to save on it. Last but not least, you will get a pocket-friendly suggestion on how to build a workstation for SQL Server.

If you have some second thoughts while picking up hardware for your new SQL Server, you have always bought 1-2-3-style machines (CPU/RAM/HDD count) or you’re simply interested in computer hardware and databases, this book is for you. Trust me, you won’t be disappointed.

[Story] Awful scripting

It happened to me few years ago, while I still was more developer than a DBA. I needed to create a database in development environment with identical structure to the one in production. Easy thing, you say, have a "Generate Scripts鈥 wizard do the job. Well, you鈥檒l see.

By the way, I would already omit one crucial detail 鈥 it was 3pm on Friday afternoon.

Since I don鈥檛 have enough rights to do it on my own (SOX forbid!), I go with a registered support request to admin team asking for a favour. The beginning was simple 鈥 Generate Scripts, all objects, new query window, 500 objects, a minute and we鈥檙e done. OK, let鈥檚 run, says he and clicks ‘鈥滶xecute鈥.

OMG, I thought seeing all those red messages in the query pane. Unable to login.

In about half an hour users start calling that they are unable to continue production and a factory is going to stop.

We call a senior admin with all the information we have and he says 鈥渞estore the master鈥. 鈥淲hat?!鈥 we ask. So in about an hour he drops by (he was already home), brings down the server, runs it in a single-user mode and restores master database from warm standby to primary production server. It takes him half an hour, factory doesn鈥檛 stop, the world is saved and we (me and the other guy who clicked Execute) are laughed at for a year. Morals?

  1. Be careful what you wish (script) for 鈥 it turned out that we scripted DROP and CREATE, for unknown reason. Double and triple check.
  2. If you have a script, but you鈥檙e unsure of it鈥檚 behaviour, don鈥檛 ever test it on production.
  3. If you have a job to do and it鈥檚 Friday afternoon, ask yourself if it can wait until Monday morning 鈥 you might have a weekend off.
  4. Have limited trust in the junior guys 鈥 let someone more experienced supervise them a bit.

What is an index – a theory (part 1/4 – dense and sparse indexes)

Since I had written a few posts about indexes before, I thought it would be good to shed some light on theory behind indexes in SQL Server. It would be divided into four parts:

  1. Dense and sparse indexes (this post).
  2. B-trees 鈥 searching for a value and operation efficiency.
  3. B-trees 鈥 inserting an item, operation cost and page splits.
  4. B-trees 鈥 deleting an item, operation cost and page merge.

By no means I am comfortable with saying that it鈥檚 a definitive source of knowledge about indexes in SQL Server. It鈥檚 a result of research conducted by me based on various sources. If you are interested to know more, try reading those books first:

  • Molina, Ullman Database Systems 鈥 The Complete Book, chapter 13
  • Knuth The Art of Computer Programming, chapter 6.2.4
  • Cormen, Leiserson, Rivest, Stein Introduction to Algorithms, chapter 18

Whenever I鈥檓 using online content by other authors, it鈥檚 always used by their consent.

Let鈥檚 start with a definition of an index. An index is a data structure that takes a property of records as input 鈥 typically the value of one or more fields 鈥 and finds the records with that property quickly [Molina, Ullman]. It is equivalent to say that an index allows us to find a record without having to scan all the possible records. The fields upon which an index is based are called search key, or just key.

To get an idea about indexes work, let鈥檚 consider following simplification. A sequential file is a data file for a relation which is sorted by primary key (we are assuming that all relations have a primary key). Note that it has some implications 鈥 all operations (search, insert, delete) have average complexity O(n), where n is a number of records. If a sequential file grows really big (has a lot of records), using it may become a problem.

If records are big, we could devise following method of indexing 鈥 create a file, called index file, in which we keep following pairs for all records from our sequential file 鈥 primary key value and pointer to the record in sequential file. Such index is called dense index, since it contains all values of primary key. Of course, it does not give much of a performance boost, but if the records in a table are big it鈥檚 easier (quicker) to scan such index than a table itself (especially if the index fits in memory). The data retrieval is done by one disk I/O since the index contains pointer to data in data file. The complexity measured in terms of number of records stays the same 鈥 it鈥檚 still O(n) on the average.

If a dense index were too big to give a benefit, we could take a shortcut and have a pointer not to every record, but, say, every fourth record. This is a sparse index, which trades a bit of speed of dense index for smaller size. In our case a sparse index is four times smaller than a dense index, so searching it would follow such procedure (we are searching for primary key value of x):

  1. Find the biggest value y in a sparse index, such as y <= x.
  2. Go to data file and check if primary key value in a record referenced by a pointer from sparse index is equal to x. If yes, we鈥檙e done.
  3. [Loop 4 times] Check the next record if primary key value is equal to x 鈥 if yes, search is complete.
  4. If primary key value of x is not found in the loop, it is not contained in the relation.

Note that step 1 in this procedure is also O(n) on the average, but with a lower coefficient hidden in a big O notation. It is also possible that a partial table scan from step 3 induces additional I/O, so having a sparse index 鈥渢oo sparse鈥 might cause extra I/O.

In order to overcome this, we can create multi-level index structure 鈥 first level being either dense or sparse, whereas next levels have to be sparse indexes (dense index in levels 2+ does not provide any performance boost). The idea is to benefit from smaller record size in index, so the index page contains more records and it requires less I/O. Therefore the procedure above would be changed to (remember that 4 is our choice here, it can be a different number):

  1. Find the biggest value y in a sparse index, such as y <= x.
  2. Go to first-level index, record next 4 records and look up the primary key value of x.
  3. If primary key value of x is not found, it is not contained in the relation.

This kind of structure becomes very similar to B-tree with one difference 鈥 a tree can grow as deep as necessary by itself, but multi-level index has arbitrary depth. Owing to this, the pessimistic example in multi-level index still has O(n) complexity, but the tree lookup will be O(log n). With this comment, it is time to end part 1 of my index theory summary and return in the future with more information about B-trees.

Index maintenance revisited

It鈥檚 almost a month since my post on index maintenance based on fragmentation, in which I attempted to show what can be done using simple script and some DMVs. There was some discussion on LinkedIn group SQLDBA about it and I would refer to it with few comments.

  1. The purpose of my script was purely to demonstrate the existence of sys.dm_db_index_physical_stats DMF, which in conjunction with other DMV yields very interesting results.
  2. The script presented by me was very simple, if not simplistic. It does not work well with partitioned indexes (my own findings), offline databases (as pointed out by Sten Westerback) and 鈥 what鈥檚 more important 鈥 it鈥檚 only criteria of operation is index fragmentation being completely unaware of size or other constraints. This does not make it a good candidate for running unmodified in production environments.
  3. As per another comment 鈥 indeed Microsoft provides very good example in Books Online regarding sys.dm_db_index_physical_stats, which deals with first two of points mentioned above.
  4. Finally, there is a fantastic maintenance tool developed by Ola Hallengren, which gives all kinds of options when it comes to database maintenance. It is successfully used by many DBAs all around world 鈥 if you need something to be working out-of-the-box, you might want to give it a try.

The reason why I鈥檓 writing this is that I don鈥檛 want to be blamed for taking a proof of concept (as my script was) and turning it into a primary maintenance solution. As usual 鈥 read, test, ask questions, adapt, share your findings, we all will benefit from that.

Enumerating SQL Server instances in a network with Power Shell

Ever wondered how many SQL Server instances are there in your network? Or maybe you wanted to check a settings or run a script on all of them? Of course you can use Registered Servers functionality and run a query on all servers in a group, but this can save you the pain of adding all servers one by one.

Since Power Shell builds on .NET framework, you are free to use all the mechanisms available in it. The one we are interested in here is SqlDataSourceEnumerator, which allows you to enumerate all available instances of SQL Server in a local network. The only prerequisite is that servers must have SQL Server Browser service running and the task is completed by the following one-liner:

[System.Data.Sql.SqlDataSourceEnumerator]::Instance.GetDataSources()

The output you get is a table consisting of four columns 鈥 server and instance name, version and information if the server is clustered.

ServerName                                 InstanceName                               IsClustered                                Version                                 
----------                                 ------------                               -----------                                -------                                   
PHOBOS-MOBILNY                                                                        No                                         9.00.4035.00                              

Now take a simple extension of it and you are able to connect to every single SQL Server in your network. It will give you the list in standard SERVER\INSTANCE format. Then you can expand it further using SMO or Power Shell snap-in and you will get all the information you need from all servers at the same time.

[System.Data.Sql.SqlDataSourceEnumerator]::Instance.GetDataSources()  | % {if ($_.InstanceName -eq &quot;&quot;.ToString) { $_.ServerName + &quot;\&quot; + $_.InstanceName} else { $_.ServerName }}

I鈥檓 awaiting – as always 鈥 your feedback, what鈥檚 good and what still needs to be improved. Or maybe there鈥檚 just something you would like to share.

Some common problems with users and logins and their solution

After few posts devoted to indexes it鈥檚 time to look at something else. Let鈥檚 see some common problems with users and logins, and methods of overcoming them. These are the basic and most frequently occuring ones.

  1. Unable to connect to server – cannot open user default database (error 4064).
    image

    The message appears if the default database for a login does not exist or it鈥檚 unavailable. An example of former would be a migration, when you move some databases and logins between servers. Latter is a little bit more tricky 鈥 imagine a server which participates in database mirroring; the mirror database is in recovery all the time, so any attempts of logging into it would be rejected.
    The solution is easy 鈥 set the default database for the login to an existing one, which always available. Good practice is to use master for this purpose 鈥 master always exists and all users can connect to it. You can use SSMS for it or 鈥 if you prefer typing:

    ALTER LOGIN [testLogin] WITH DEFAULT_DATABASE=[master]
  2. Unable to map login to user in a database – user, group or role already exists in the current database (error 15023).

    image

    This message can also appear during a migration, if you copy (or move) your database (using Copy Database Wizard or detach/attach) but forget to create respective logins first, rather create logins after the database is brought online, and then try to map them to database users.

    The solution is also simple 鈥 but only T-SQL this time. You have to execute it in the database context and it works only for SQL logins (see Books Online for details of sp_change_users_login):

    sp_change_users_login 'update_one', login_name, user_name

    There is also a note in mentioned BOL article saying that from SQL Server 2008 sp_change_users_login stored procedure is considered to be removed in the future. Instead, it is recommended to use ALTER USER statement:

    ALTER USER user_name TO LOGIN = login_name
  3. Orphaned database (no owner set).

    I can鈥檛 think of any reasonable explanation for that, but since I have experienced that recently, it鈥檚 worth mentioning. You restore a database from backup and it turns out that user dbo is mapped to a login that does not exist. You can do it from SSMS (Database Properties 鈥> Files 鈥> input database owner) or using T-SQL:

    sp_changedbowner login_name

    If you look for sp_changedbowner in the Books Online, you’ll notice that it is marked for deprecation as well. The suggested solution is to use ALTER AUTHORIZATION insted:

    ALTER AUTHORIZATION ON DATABASE::DBOUserTest to [login_name]

A few words about required privileges to complete these actions. ALTER LOGIN requires ALTER ANY LOGIN permission, but a principal can change it for its own login (this may fail if tried from SSMS since it sets CHECK_EXPIRATION and CHECK_POLICY which require CONTROL SERVER permission, at least it鈥檚 what I observed on SQL Server 2005). To remap a user with ALTER USER, you require CONTROL permission in a database. sp_change_users_login requires membership in db_owner database role. Finally, ALTER AUTHORIZATION requires TAKE OWNERSHIP on the database (on the object, in general) and might require IMPERSONATE, if you鈥檙e not logged with login taking ownership. sp_changedbowner might need CONTROL SERVER in addition to that, if new owner doesn鈥檛 have a user in a database. Since a DBA probably is a sysadmin, this is not a problem, but I have seen users in live databases yielding CONTROL which might make us wonder 鈥渄o they really need it?鈥.

Well, this would be all about users and logins for now. If you have questions, suggestions or corrections, feel free to comment and ask. I鈥檒l be happy to respond.

How to find potential missing indexes in a database

In previous posts (here and here), I described the methods I use to monitor and maintain indexes. This is a next chapter 鈥 let鈥檚 try to find candidates for new indexes or hints for extending existing ones. Of course, there are few methods to achieve this goal:

  1. Find a candidate query and analyze it in Management Studio (applies to SQL Server 2008 or newer). This reduces to another problem 鈥 how to find a suitable query? You can use Profiler, but you have to spend some time working on filtering plus you can miss the query, if you鈥檙e not patient enough. You can also use reports in Management Studio 鈥 right click on server, select Reports 鈥> Standard Reports and choose Peformance 鈥 Top Queries by Average IO and Top Queries by Total IO. Queries in Top Queries in Average IO are the best candidates for optimizing, especially if they are frequently used.
    As for analysis 鈥 you can do it yourself manually or take advantage of new SSMS feature in 2008 鈥 if you display execution plan, it will mention an index which could be helpful and projected benefit if such index is created.
  2. If you have a query or a bunch of them, you can provide them as input to Database Engine Tuning Advisor and get the recommendation.
  3. You can also check sys.dm_db_missing_index_details DMV to get information of all possible missing indexes in a database. This is actually the method used in SSMS (mentioned in step 1) 鈥 if it stumbles upon a scan or a seek which would benefit from additional index, it updates stats in the DMV and returns information to the user. If you want to find all indexes you can use following script:
    select 
    	mid.statement as [Table name], 
    	mid.equality_columns, 
    	mid.inequality_columns, 
    	mid.included_columns, 
    	isnull(migs.unique_compiles, 0) as [Compile count], 
    	isnull(migs.user_seeks, 0) as [User seeks], 
    	isnull(migs.user_scans, 0) as [User scans], 
    	migs.avg_total_user_cost, 
    	migs.avg_user_impact
    from 
    	sys.dm_db_missing_index_details mid
    	inner join sys.dm_db_missing_index_groups mig on mid.index_handle = mig.index_handle
    	left join sys.dm_db_missing_index_group_stats migs on mig.index_group_handle = migs.group_handle

Compare it also with Books On-Line entries for sys.dm_db_missing_index_columns dynamic management function (there is a similar example to this one in this article) and sys.dm_db_missing_index_details dynamic management view for more information. And 鈥 as usual 鈥 please let me know if you have any comments or suggestions.