Monthly Archives: March 2018

How to assess a SQL Server instance for GDPR compliance by writing every table and column to Excel

This post uses the script I had written before here LINK modified slightly to include a count of each table. If you are a DBA you’re likely assisting compliance offers to find personal identifiable data within the databases at this time. This script will allow you to provide them with a record of every database, table and column on an entire instance. The compliance offer can then sieve through all the columns and highlight any columns that look like they contain personal data for further investigation.

Below is a SQL query that will return the following metadata about each table from each database located on a SQL server database server:

  • ServerName
  • DatabaseName
  • SchemaName
  • TableName
  • CountOfRows
  • ColumnName
  • KeyType

The output is provided in a table format with these additional formatted lines of text which can be used as queries or as part of queries.

  • A Select table query
  • A Count table row columns
  • A Select column query
  • Each column bracketed
  • Each table and column bracketed

The table returned by the query can be exported to excel. Using excels filter option applied to the columns of the table makes finding and selecting specific tables and columns very easy.

This process can be repeated for every SQL Server instance used by the business to generate a single mapped servers master excel file allowing the user to find any table or column available to the organization quickly.

Applying some colour coding like below adds to the ease of use.

Image of excel file with mapped database server structure

How to use:

Simply open SQL Server Management Studio and from object explorer right click on the server name and select new query. This will open a window set to the master database of the server. Copy and paste the SQL below into this SQL Server window and execute. When the query is finished you will have created the table above.

/*
SCRIPT UPDATED
20180316
*/

USE [master]
GO

/*DROP TEMP TABLES IF THEY EXIST*/
IF OBJECT_ID('tempdb..#DatabaseList') IS NOT NULL
	DROP TABLE #DatabaseList;

IF OBJECT_ID('tempdb..#TableStructure') IS NOT NULL
	DROP TABLE #TableStructure;

IF OBJECT_ID('tempdb..#ErrorTable') IS NOT NULL
	DROP TABLE #ErrorTable;

IF OBJECT_ID('tempdb..#MappedServer') IS NOT NULL
	DROP TABLE #MappedServer;
	
IF OBJECT_ID('tempdb..#TableCount') IS NOT NULL
	DROP TABLE #TableCount;
	
IF OBJECT_ID('tempdb..#Count') IS NOT NULL
	DROP TABLE #Count;

DECLARE @ServerName AS SYSNAME
DECLARE @Count INT

SET @ServerName = @@SERVERNAME

CREATE TABLE #DatabaseList (
	Id INT NOT NULL IDENTITY(1, 1) PRIMARY KEY
	,ServerName SYSNAME
	,DbName SYSNAME
	);

CREATE TABLE [#TableStructure] (
	[DbName] SYSNAME
	,[SchemaName] SYSNAME
	,[TableName] SYSNAME
	,[ColumnName] SYSNAME
	,[KeyType] CHAR(7)
	) ON [PRIMARY];
	
CREATE TABLE [#TableCount] (
	[Id_TableCount] INT NOT NULL IDENTITY(1, 1) PRIMARY KEY
	,[ServerName] SYSNAME
	,[DatabaseName] SYSNAME
	,[SchemaName] SYSNAME
	,[TableName] SYSNAME
	,[SQLCommand] VARCHAR(MAX)
	,[TableCount] INT
	);

CREATE TABLE #Count (ReturnedCount INT);

/*THE ERROR TABLE WILL STORE THE DYNAMIC SQL THAT DID NOT WORK*/
CREATE TABLE [#ErrorTable] ([SqlCommand] VARCHAR(MAX)) ON [PRIMARY];

/*
A LIST OF DISTINCT DATABASE NAMES IS CREATED
THESE TWO COLUMNS ARE STORED IN THE #DatabaseList TEMP TABLE
THIS TABLE IS USED IN A FOR LOOP TO GET EACH DATABASE NAME
*/
INSERT INTO #DatabaseList (
	ServerName
	,DbName
	)
SELECT @ServerName
	,NAME AS DbName
FROM master.dbo.sysdatabases WITH (NOLOCK)
WHERE NAME <> 'tempdb'
ORDER BY NAME ASC

/*VARIABLES ARE DECLARED FOR USE IN THE FOLLOWING FOR LOOP*/
DECLARE @sqlCommand AS VARCHAR(MAX)
DECLARE @DbName AS SYSNAME
DECLARE @i AS INT
DECLARE @z AS INT

SET @i = 1
SET @z = (
		SELECT COUNT(*) + 1
		FROM #DatabaseList
		)

/*WHILE 1 IS LESS THAN THE NUMBER OF DATABASE NAMES IN #DatabaseList*/
WHILE @i < @z
BEGIN
	/*GET NEW DATABASE NAME*/
	SET @DbName = (
			SELECT [DbName]
			FROM #DatabaseList
			WHERE Id = @i
			)
	/*CREATE DYNAMIC SQL TO GET EACH TABLE NAME AND COLUMN NAME FROM EACH DATABASE*/
	SET @sqlCommand = 'USE [' + @DbName + '];' + '

INSERT INTO [#TableStructure]
SELECT DISTINCT ' + '''' + @DbName + '''' + ' AS DbName
	,SCHEMA_NAME(SCHEMA_ID) AS SchemaName
	,T.NAME AS TableName	
	,C.NAME AS ColumnName
	,CASE 
		WHEN OBJECTPROPERTY(OBJECT_ID(iskcu.CONSTRAINT_NAME), ''IsPrimaryKey'') = 1 
			THEN ''Primary'' 
		WHEN OBJECTPROPERTY(OBJECT_ID(iskcu.CONSTRAINT_NAME), ''IsForeignKey'') = 1 
			THEN ''Foreign''
		ELSE NULL 
		END AS ''KeyType''
FROM SYS.TABLES AS t WITH (NOLOCK)
INNER JOIN SYS.COLUMNS C ON T.OBJECT_ID = C.OBJECT_ID
LEFT JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS iskcu WITH (NOLOCK) 
ON SCHEMA_NAME(SCHEMA_ID) = iskcu.TABLE_SCHEMA 
	AND T.NAME = iskcu.TABLE_NAME
	AND C.NAME = iskcu.COLUMN_NAME
ORDER BY SchemaName ASC
	,TableName ASC
	,ColumnName ASC;
';

	/*ERROR HANDLING*/
	BEGIN TRY
		EXEC (@sqlCommand)
	END TRY

	BEGIN CATCH
		INSERT INTO #ErrorTable
		SELECT (@sqlCommand)
	END CATCH

	SET @i = @i + 1
END

/*
JOIN THE TEMP TABLES TOGETHER TO CREATE A MAPPED STRUCTURE OF THE SERVER
ADDITIONAL FIELDS ARE ADDED TO MAKE SELECTING TABLES AND FIELDS EASIER
*/
SELECT DISTINCT @@SERVERNAME AS ServerName
	,DL.DbName
	,TS.SchemaName
	,TS.TableName
	,TS.ColumnName
	,TS.[KeyType]
	,',' + QUOTENAME(TS.ColumnName) AS BracketedColumn
	,',' + QUOTENAME(TS.TableName) + '.' + QUOTENAME(TS.ColumnName) AS BracketedTableAndColumn
	,'SELECT * FROM ' + QUOTENAME(DL.DbName) + '.' + QUOTENAME(TS.SchemaName) + '.' + QUOTENAME(TS.TableName) + '--WHERE --GROUP BY --HAVING --ORDER BY' AS [SelectTable]
	,'SELECT ' + QUOTENAME(TS.TableName) + '.' + QUOTENAME(TS.ColumnName) + ' FROM ' + QUOTENAME(DL.DbName) + '.' + QUOTENAME(TS.SchemaName) + '.' + QUOTENAME(TS.TableName) + '--WHERE --GROUP BY --HAVING --ORDER BY' AS [SelectColumn]
	,'SELECT COUNT(*) FROM ' + QUOTENAME(DL.DbName) + '.' + QUOTENAME(TS.SchemaName) + '.' + QUOTENAME(TS.TableName) + 'WITH (NOLOCK)' AS [PerformTableCount]
INTO #MappedServer
FROM [#DatabaseList] AS DL
INNER JOIN [#TableStructure] AS TS ON DL.DbName = TS.DbName
ORDER BY DL.DbName ASC
	,TS.SchemaName ASC
	,TS.TableName ASC
	,TS.ColumnName ASC

/*
HOUSE KEEPING
*/
IF OBJECT_ID('tempdb..#DatabaseList') IS NOT NULL
	DROP TABLE #DatabaseList;

IF OBJECT_ID('tempdb..#TableStructure') IS NOT NULL
	DROP TABLE #TableStructure;
	
INSERT INTO #TableCount (
	[ServerName]
	,[DatabaseName]
	,[SchemaName]
	,[TableName]
	,[SQLCommand]
	)
SELECT DISTINCT [ServerName]
	,[DbName]
	,[SchemaName]
	,[TableName]
	,[PerformTableCount]
FROM #MappedServer
ORDER BY [ServerName] ASC
	,[DbName] ASC
	,[SchemaName] ASC
	,[TableName] ASC

SET @i = 1
SET @z = (
		SELECT COUNT(*) + 1
		FROM #TableCount
		)
	
WHILE @i < @z
BEGIN
	SET @SQLCommand = (
			SELECT SQLCommand
			FROM #TableCount
			WHERE Id_TableCount = @i
			)

	--ERROR HANDLING
	BEGIN TRY
		INSERT INTO #Count
		EXEC (@SqlCommand)
	END TRY

	BEGIN CATCH
		INSERT INTO #ErrorTable
		SELECT (@sqlCommand)
	END CATCH

	SET @Count = (
			SELECT ReturnedCount
			FROM #Count
			)

	TRUNCATE TABLE #Count

	UPDATE #TableCount
	SET TableCount = @Count
	WHERE Id_TableCount = @i;

	SET @i = @i + 1
END

SELECT *
FROM #ErrorTable;

IF OBJECT_ID('tempdb..#ErrorTable') IS NOT NULL
	DROP TABLE #ErrorTable;

/*
THE DATA RETURNED CAN NOW BE EXPORTED TO EXCEL
USING A FILTERED SEARCH WILL NOW MAKE FINDING FIELDS A VERY EASY PROCESS
*/
SELECT DISTINCT ms.ServerName
	,ms.DbName
	,ms.SchemaName
	,ms.TableName
	,ts.TableCount AS CountOfRows
	,ms.ColumnName
	,ms.KeyType
	,ms.BracketedColumn
	,ms.BracketedTableAndColumn
	,ms.SelectColumn
	,ms.SelectTable
FROM #MappedServer AS ms
LEFT JOIN #TableCount AS ts ON ms.ServerName = ts.ServerName
AND ms.DbName = ts.DatabaseName
AND ms.SchemaName = ts.SchemaName
AND ms.TableName = ts.TableName
ORDER BY ms.DbName ASC
	,ms.SchemaName ASC
	,ms.TableName ASC
	,ms.ColumnName ASC;
	
IF OBJECT_ID('tempdb..#MappedServer') IS NOT NULL
	DROP TABLE #MappedServer;

IF OBJECT_ID('tempdb..#TableCount') IS NOT NULL
	DROP TABLE #TableCount;
	
IF OBJECT_ID('tempdb..#Count') IS NOT NULL
	DROP TABLE #Count;

 

How to determine where a new database should be deployed

Below I’ve listed 22 questions to consider when deciding where a new SQL Server database should be deployed given the four main options of:

  1. Add Database to existing Instance
  2. Create separate Instance for Database
  3. Create separate VM for Instance and Database
  4. New physical machine needed

There’s no hard and fast rule for determining where a database should go however there are often a lot of limitations dictating your choices. I’ve started the list with a big limitation, money.

If you want to use the list like a score card I’ve prepared one called “New Database Hosting Environment Assessment Form”, I know not the snappiest of titles.

New Database Hosting Environment Assessment Form

Use: I’d recommend stepping through the questions with the requester while gathering feedback. You can then fill in the option scoring with weighted figures or simply use the boxes to record notes. The main deliverable of the form is you will have some documentation that demonstrates the location for the database was assessed and the logic and reasoning for the location chosen has been recorded.

Of course consider adding additional questions relevant to your case to the form. If you think your additional questions would be relevant to everyone deploying a new database free to forward your questions on to me and I’ll include them in the form.

The form can be downloaded here link.

Question 1: Can the business afford a new physical machine?

Considerations: If the financial resources are not available option four
can be eliminated.

Question 2: Are there OS or SQL Server Licensing restrictions?

Considerations: If there are no restrictions to a new VM with it’s own instance this will likely offer the most scalable solution and produce the best performance. If restrictions exist options are limited to 1 & 2.

Question 3: Is the Database the back end for a stand-alone application?

Considerations: The back end for a stand-alone application should be isolated as much as possible from other impacting factors. The higher the number of the option chosen the better.

Question 4: What is the primary purpose of the Database?

Considerations: What is the business use\s and in what capacity will the database function i.e. Transactional Db, Reporting Db, Datastore?

Question 5: Do you want\need to be able to limit access to hardware resources?

Considerations: Access to resources, ram etc. can only be limited at an instance level. Option 1 cannot be used.

Question 6: Are there any SQL Server Service Pack dependencies?

Considerations: SQL Server can have different SPs on different instances but cannot have different SPs for different databases within an instance.

Question 7: What is the current excess capacity of the hardware?

Considerations: Can the Hardware handle the additional workload? If not either reduce resource consumption of resources on the hardware, add RAM and/or cores or choose option 4.

Question 8: Is there a VM capacity restraint on the hardware?

Considerations: Can the hardware actually host a new VM without a trade off in resources or a decrease in performance? If restrained option 3 cannot be used.

Question 9: What is the VM capacity?

Considerations: Is the OS already maxed out on resources? Can the VM handle the additional workload?

Question 10: Is there an expected increase in size and usage in the Database over time?

Considerations: If known and minimal option 1 can be used. If unknown or unlimited the higher the number of the option chosen the better.

Question 11: Is the resource usage of the new Database known?

Considerations: Benchmarking RAM, CPU and bandwidth usage should be carried out prior to installation.

Question 12: What are the disaster recovery requirements?

Considerations: Should the databases that share common dependencies be hosted on
the same server?

Question 13: What is the required operational up time? 24/7 etc.

Considerations: Does this operational run time tie in with the rest of the databases
on the instance, or the instances hosted on a VM or physical server?

Question 14: What are the Maintenance requirements?

Considerations: Will new index rebuilds etc. take a long time and effect the schedule of the instance overall?

Question 15: What are the Backup requirements?

Considerations: Will the backup schedule of the other databases be impacted
by the addition?

Question 16: Is the Database functionally similar or supporting databases currently on the instance?

Considerations: Does the new Database logically fit in with the database/s currently running on an instance?

Question 17: Have server security concerns been addressed?

Considerations: Will people with administrative access to the server have access to a Database and data they should not be able to see?

Question 18: Does hosting particular databases together compound the potential
damage If an unauthorised person was to gain access to the server?

Considerations: Will an unauthorised person have access to data that represents a major security breach if the data is combined from different databases?

Question 19: Does a vendor have control over the instance or server?

Considerations: Will putting a Database on a particular instance leave you at the mercy of a vendor?

Question 20: Will stacking instances make tracking down performance issues across
the VM difficult?

Considerations: Will this create a server level noisy neighbour problem?

Question 21: Will packing databases too densely make tracking down performance issues across the Instance difficult?

Considerations: Will this create an instance level noisy neighbour problem?

Question 22: Will moving the Database to be hosted somewhere else in the
future be possible?

Considerations: Does this decision need to be gotten right the first time?