Documenting My Home Network/Lab

I’m a big believer in learning by doing. Over the last couple of years I’ve built up a nice little home network/lab where I host various applications and test new tech. I will hardly be the first person to make this recommendation, but if you are able to get your hands on some hardware to set up a home lab, I would highly recommend it. Any old computer you can turn into an ESXi host is enough to get started. I have learned an enormous amount about technologies that I wouldn’t have had much opportunity to learn elsewhere. First, the obligatory picture:

The breakdown:

  • Ubiquiti Edgerouter X
  • Ubiquiti UniFi AP AC LR (not pictured; it’s mounted in a central location in the house)
  • ARRIS SURFboard DOCSIS 3.0 Cable Modem (SB6183)
  • Cisco SG300-28
  • Cisco SF302-08MPP
  • TRENDnet 24-Port Cat6 Patch Panel (22 runs in the house, with a minimum of 2 per room)
  • Startech 12U 19-Inch Desktop Open Frame 2 Post Rack
  • Raspberry Pi 3 Model B
  • Lenovo TS140
    • Xeon E3-1226 v3 3.3 GHz
    • 16 GB RAM
    • 44 TB (raw) of various spinning storage, mostly WD Reds (yeah, yeah, I know. I’ll get a NAS eventually)
  • Old computer serving as my ESXi host
    • Intel Core i7-2600K Sandy Bridge Quad-Core 3.4 GHz
    • 16 GB RAM
    • Samsung 840 EVO 250GB SSD
    • 5 TB (raw) of WD Reds
  • 2x CyberPower CP1500AVRLCD UPS
    • One is for my “core” equipment (network stuff/TS140), the other is on the ESXi host

I am a huge fan of the Ubiquiti hardware. This combo was one of my first purchases, and it’s pretty fantastic not having to regularly “reset the router” as seems to be necessary with just about every consumer router/AP. I picked up both switches off of eBay and have been pleased with them. Some Cisco purists dislike the small business line, but they’ve been great for my purposes. The SG300-28 (gigabit) is my main switch and the SF302-08MPP (fast ethernet, PoE) runs my cameras. I didn’t need separate switches, but based on what was available on eBay this setup was perfect for my goals. If we’re being honest, need went out the window a long time ago anyway. I run a handful of VLANs and 2 separate physical LANs to segment off the different portions of my network. Necessary? Probably not, but where’s the fun in that? This has allowed me to tinker with VLANs and firewall rules. Fun stuff!

The TS140 runs AD, DNS, and Blue Iris (cameras), and acts as a file server. My storage situation could certainly be improved, but all important data is backed up both locally and remotely, and any data I consider less important is at least mirrored to separate local storage. My Raspberry Pi runs PiVPN, which is a super simple way to run OpenVPN and is much more manageable than my previous setup, where I ran it on my router. The ESXi host runs the following VMs and applications:

  • Plex and related apps
  • Borg Backup
  • Crashplan
  • Nextcloud
  • Kali box for security tinkering
  • Windows VM for when I need to run Windows specific apps
  • LibreNMS
  • NGINX
  • Whatever I feel like playing with at the moment

Each of my Linux boxes is running Ubuntu 16.04. All VMs are regularly backed up using ghettoVCB.

That’s pretty much it for now. I will update when I inevitably add more!

Happy 2015

New year, new updates. After a brief stint as a DBA at another organization, I have returned to my previous employer to take on a new and exciting role as a Database Administrator in a large SaaS environment. This environment has customers and data centers all over the world and will certainly present some new challenges. It’s no secret that IT as a whole is trending toward cloud architecture, and I’m excited to be right in the thick of it.

I’m also excited to announce that I will be presenting my session “Troubleshooting SQL Server Performance Using Wait Stats” at the Ohio North SQL Server User Group on January 6 and then again at SQLSaturday #357 – Cleveland on February 7. These events will be my first speaking engagements and I am definitely looking forward to them.

I have again become quiet on this blog as the last few months have seen two career moves that have kept me busy. Now that I have started to settle in I hope to increase my posting frequency. My new role should provide me with plenty to share!

The WAIT_AT_LOW_PRIORITY Option

My first post on this blog detailed a scenario where a read uncommitted select statement could ultimately block an insert/update/delete. In that scenario, a long-running read uncommitted select holds a schema stability (Sch-S) lock. That lock prevents an online index rebuild from grabbing the schema modification (Sch-M) lock it needs, and the update statement gets queued up behind the rebuild.

SQL Server 2014 introduced an option that allows more control over how online rebuilds behave in a scenario such as the one I described. The WAIT_AT_LOW_PRIORITY option gives you a few choices for dealing with blocking scenarios. The syntax, from Books Online, is below.
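    ALTER INDEX { index_name | ALL } ON <object>
    REBUILD WITH ( ONLINE = ON (
        WAIT_AT_LOW_PRIORITY (
            MAX_DURATION = <time> [ MINUTES ],
            ABORT_AFTER_WAIT = { NONE | SELF | BLOCKERS }
        )
    ) );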

MAX_DURATION is the time, in minutes, the rebuild will wait to acquire the necessary locks before taking action. ABORT_AFTER_WAIT tells it what to do after that time period has passed. Setting it to NONE means SQL Server will simply continue to wait, while setting it to SELF will cause it to give up on the rebuild. If the rebuild absolutely must finish, it can be set to BLOCKERS, which will kill the process or processes preventing the rebuild from completing. To see this in action, let’s first create our old scenario without the WAIT_AT_LOW_PRIORITY option.
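A minimal sketch of that scenario, using hypothetical names (a dbo.Person table with a clustered index named PK_Person):

    -- Session 1: a long-running read uncommitted select holds a Sch-S lock.
    -- (The cross joins just make the select run artificially long.)
    SELECT COUNT(*)
    FROM dbo.Person AS p1 WITH (NOLOCK)
    CROSS JOIN dbo.Person AS p2
    CROSS JOIN dbo.Person AS p3;

    -- Session 2: the online rebuild now waits behind the Sch-S lock for its Sch-M lock.
    ALTER INDEX PK_Person ON dbo.Person
    REBUILD WITH (ONLINE = ON);

    -- Session 3: this update gets queued up behind the rebuild.
    UPDATE dbo.Person SET lname = 'Smith' WHERE id = 1;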

[Screenshot: LowPriority1]

Now let’s try the same thing, except this time we will use the WAIT_AT_LOW_PRIORITY option with a MAX_DURATION of 1 and ABORT_AFTER_WAIT set to SELF. This means that after a minute of waiting, the rebuild will give up.
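With the same hypothetical names, the rebuild becomes:

    ALTER INDEX PK_Person ON dbo.Person
    REBUILD WITH (ONLINE = ON (
        WAIT_AT_LOW_PRIORITY (MAX_DURATION = 1 MINUTES, ABORT_AFTER_WAIT = SELF)
    ));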

[Screenshot: LockPriority2]

As we can see, after a minute of waiting the rebuild gave up, and we would not have been able to recreate our previous blocking scenario.

Sort order without an ORDER BY

If you’ve worked with SQL Server for any length of time, you know that sort order is never guaranteed without an ORDER BY. I was recently asked why a query (without an ORDER BY) brings back results in alphabetical order in production, while in test it returns them in the order they were inserted into the table. To explain this, I put together a quick example that I thought I would share.

First, let’s create a table and populate it with some data. We will be creating a clustered index on an identity column and a nonclustered index on the lname column. I then used generatedata.com to create 100 dummy rows to insert.
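A sketch of that setup (column names other than lname are assumptions, and two sample rows stand in for the generated data):

    CREATE TABLE dbo.Person (
        id    INT IDENTITY(1,1) NOT NULL,
        fname VARCHAR(50) NOT NULL,
        lname VARCHAR(50) NOT NULL,
        CONSTRAINT PK_Person PRIMARY KEY CLUSTERED (id)
    );

    CREATE NONCLUSTERED INDEX IX_Person_lname ON dbo.Person (lname);

    -- The 100 dummy rows from generatedata.com went here; two examples for illustration.
    INSERT INTO dbo.Person (fname, lname)
    VALUES ('Aaron', 'Lambert'), ('Beth', 'Lucas');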

Now let’s go ahead and select all rows from the table where the last name begins with the letter L and grab the execution plan.
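Using the sketch table from above:

    SELECT id, fname, lname
    FROM dbo.Person
    WHERE lname LIKE 'L%';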

[Screenshot: resultorder_nohint]
[Screenshot: resultorder_ex_nohint]

Looking at the identity column, we can see that it brought back the results in the order they were inserted into the table. With the table being so small, it was most efficient to scan the clustered index, and since the clustered index is on the identity column, the rows are stored on its pages in that order. Now let’s hint the same query to use the nonclustered index on lname.
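The hinted version, using the hypothetical index name from the sketch:

    SELECT id, fname, lname
    FROM dbo.Person WITH (INDEX (IX_Person_lname))
    WHERE lname LIKE 'L%';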

[Screenshot: resultorder_withhint]

[Screenshot: resultorder_ex_withhint]

Here we can see that it returned the results ordered by lname, which is the order in which the lname column and clustering key are stored at the leaf level of our nonclustered index. This turned out to be what was happening in the environment I was asked about: the table in the production system was much larger, and the optimizer was correctly choosing to use the nonclustered index. This is just a simple example of how sort order is never guaranteed without an ORDER BY.

All About Collation

Collation is one of the easiest settings to overlook when configuring SQL Server but it’s one that can come back to bite you if it isn’t properly set. In this post I intend to cover the basics of collations, common issues resulting from collation misconfiguration, and basic resolution of those common issues.

So what is a collation? One way to think of a collation is as a dialect of a language. For example, in the U.S. we use the word color. Other English-speaking countries use the word colour instead. Who is right? Well, that depends entirely on where you are from. If you are in the U.S., color is probably going to be considered correct. If you had never even seen the word colour, would you know it means the same thing as color?

Collations can be thought of in a similar way. There are many variations of a language, each with different rules associated. These rules are set by the collation and govern things like sorting, case sensitivity, and accent sensitivity. In one collation, ‘ABC’ may be the same thing as ‘abc’ while in others that may not be the case. The “language” associated with a collation is going to be a code page.

Single-byte (8-bit) character types (char, varchar, text) can form a total of 256 different combinations with their bits flipped on and off. Since there are more than 256 characters in the entire world, there is no way to represent every possible character in a single byte. This is where code pages come in. Code pages exist at the Windows level and keep track of which characters those 256 combinations represent.

For example, English-speaking countries are going to use the Latin code page (1252). The 256 characters on that code page are going to be characters used in the English language. We wouldn’t be able to fit every Greek character in that same code page, so Greek has its own code page as well (1253). All collations, their descriptions, and their code pages can be found with the query below.
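One way to write that query, using sys.fn_helpcollations() and COLLATIONPROPERTY:

    SELECT name,
           description,
           COLLATIONPROPERTY(name, 'CodePage') AS code_page
    FROM sys.fn_helpcollations()
    ORDER BY name;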

Unicode data types (nchar, nvarchar, ntext) are stored as two bytes per character, so they can actually store multiple languages without having to reference a code page. Unicode data types still play by the sorting and comparison rules of the collation, though, so you don’t get to ignore collation just because you are using them.

Collation is set at the instance level, the database level, and the column level. When a new database is created it will use the instance collation unless otherwise specified. Your system databases will also match that instance collation. Any character columns created in your user databases will use the database collation when they are created unless a collation is defined in the create statement.

When problems with collation occur, it’s typically because something was overlooked in the planning stages. At some point somebody realizes the wrong collation was used and decides to change it at the database level. Problem solved, right? Unfortunately not, as the database collation is simply what SQL Server looks to when new objects are created. Changing this setting has no impact on existing objects. So in this scenario, the change is made, time goes on, and new tables are added to the database using the new collation. At that point you have two collations actively used in your database, and if there is ever a need to compare columns in different collations you will encounter errors.

The other issue commonly seen is a TempDB collation mismatch. We know that TempDB’s default collation matches the instance collation, so any objects (such as temp tables) created in TempDB will use that collation by default. So what happens if your user database uses a different collation and you have to compare its data to data in a temp table created with TempDB’s default collation? You guessed it: collation mismatch errors.

The proper fix for these issues really depends on the scenario. With collation mismatches within a user database, you can simply alter the columns if only a few tables are impacted. Unfortunately these issues, in my experience, are rarely caught that early. I have worked with databases that used multiple collations throughout, where manually altering the columns was more work than it was worth. In that scenario I would just migrate all the data to a new database created with the correct collation.

Your other option if there are only a few tables is to collate the columns in question within the query. This is done by adding a COLLATE clause to the column. Unless this is a report or some other ad-hoc query, this probably isn’t going to be a very viable option.

With TempDB collation mismatches the fix is somewhat similar. You can take the sledgehammer approach and either change the collation of the database and all existing objects to match the instance (and therefore TempDB), or change the instance collation. Neither is an easy task, and both can introduce even more issues if done incorrectly. Alternatively, you could modify the code as in the other scenario, either by specifying a collation when creating the temporary object or by modifying the queries to collate the columns on the fly.

Not scared yet? Well, the above are best-case scenarios that assume the collations in question share the same code page. When they use different code pages, a potential for data loss exists when changing between collations, and you probably won’t know when it happens. Your best bet is to avoid these issues altogether by making sure the proper collation is used in the first place. As a general rule, your instance and all databases on that instance should use the same collation. This is the best way to be sure you never encounter issues within your user databases or in TempDB.

Let’s create a test database and step through a few examples.
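A minimal sketch of the setup (the database name is an assumption; person1 is the table used throughout the examples):

    CREATE DATABASE CollationTest;  -- inherits the instance collation
    GO
    USE CollationTest;
    GO
    CREATE TABLE dbo.person1 (
        id    INT IDENTITY(1,1) PRIMARY KEY,
        lname VARCHAR(50) NOT NULL
    );
    INSERT INTO dbo.person1 (lname)
    VALUES ('Lambert'), ('Lucas');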

Now let’s take a look at the collation at the instance, database, and column levels.
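For example:

    SELECT SERVERPROPERTY('Collation') AS instance_collation;

    SELECT collation_name AS database_collation
    FROM sys.databases
    WHERE name = 'CollationTest';

    SELECT name AS column_name, collation_name AS column_collation
    FROM sys.columns
    WHERE object_id = OBJECT_ID('dbo.person1')
      AND collation_name IS NOT NULL;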

[Screenshot: CollationInfo]

We can see that our instance collation is SQL_Latin1_General_CP1_CI_AS. As we learned, our database collation matches it, as do our columns, since we did not specify otherwise. Now let’s change the collation of our database and create a new table with that collation. If we select from the two tables, we will see that the data looks identical.
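A sketch of that change, using Latin1_General_100_CI_AI for the new table:

    ALTER DATABASE CollationTest COLLATE Latin1_General_100_CI_AI;
    GO
    CREATE TABLE dbo.person2 (
        id    INT IDENTITY(1,1) PRIMARY KEY,
        lname VARCHAR(50) NOT NULL  -- inherits Latin1_General_100_CI_AI
    );
    INSERT INTO dbo.person2 (lname)
    VALUES ('Lambert'), ('Lucas');

    SELECT lname FROM dbo.person1;
    SELECT lname FROM dbo.person2;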

[Screenshot: CollationLooksSame]

Both sets of data look identical, but watch what happens when we try to join them together.
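The join that trips the conflict:

    SELECT p1.lname
    FROM dbo.person1 AS p1
    INNER JOIN dbo.person2 AS p2
        ON p1.lname = p2.lname;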

[Screenshot: CollationError]

And now we have the dreaded “cannot resolve the collation conflict” error. This is why collation mismatches can be a silent killer: you generally don’t know it’s an issue until it jumps out at you. As we discussed earlier, there are two ways to fix this: alter the collation of the columns themselves, or modify the query. First, we will alter the columns, with success.
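Altering the column in place:

    ALTER TABLE dbo.person2
    ALTER COLUMN lname VARCHAR(50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL;

    -- The join from before now succeeds.
    SELECT p1.lname
    FROM dbo.person1 AS p1
    INNER JOIN dbo.person2 AS p2
        ON p1.lname = p2.lname;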

[Screenshot: JoinSuccess]

The other option is to use the COLLATE clause within our query. I have set our columns back to the wrong collation for this example.
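Collating on the fly within the join:

    SELECT p1.lname
    FROM dbo.person1 AS p1
    INNER JOIN dbo.person2 AS p2
        ON p1.lname = p2.lname COLLATE SQL_Latin1_General_CP1_CI_AS;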

[Screenshot: JoinSuccess]

Let’s now look at TempDB collation mismatches. As it stands currently, our person1 table is using the SQL_Latin1_General_CP1_CI_AS collation and person2 is using Latin1_General_100_CI_AI. Here we will create and populate a temp table and then try to join to it.
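A sketch of the temp table demo:

    CREATE TABLE #person (
        lname VARCHAR(50) NOT NULL  -- inherits TempDB's (instance) collation
    );
    INSERT INTO #person (lname)
    SELECT lname FROM dbo.person2;

    SELECT p2.lname
    FROM dbo.person2 AS p2
    INNER JOIN #person AS tp
        ON p2.lname = tp.lname;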

[Screenshot: CollationError]

Once again, our join fails because our temp table was created using the default collation of TempDB. If we were to include a collation in our create table statement, this would work without issue. Alternatively we could modify our select query to use the COLLATE clause like in the previous example.
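Both fixes, sketched:

    -- Fix 1: pin the collation when creating the temp table.
    CREATE TABLE #person2 (
        lname VARCHAR(50) COLLATE Latin1_General_100_CI_AI NOT NULL
    );

    -- Fix 2: collate within the query instead.
    SELECT p2.lname
    FROM dbo.person2 AS p2
    INNER JOIN #person AS tp
        ON p2.lname = tp.lname COLLATE Latin1_General_100_CI_AI;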

[Screenshot: JoinSuccess]

Unfortunately, the real-world examples I have encountered are rarely this simple to resolve. If thousands of objects are using multiple collations, the solution becomes a little less clear. It becomes even more difficult when different code pages are in play, as data loss is a possibility. With proper planning, however, these situations can be avoided entirely.