Cloud, cloud, cloud… it’s everywhere, and it has been for a while. If I had a pound for every time someone told me “it’s the future” I would be sitting on a beach instead of writing this review. More fool me, perhaps!
With all the pressure to keep up with the Joneses, we haven’t ignored the progression of the Cloud either. Indeed, over the past couple of years we have done a lot of testing on various platforms to understand where it could fit into our IT infrastructure, and we have even migrated some of our end-of-life kit onto Cloud solutions instead of buying new hardware. This article gets a bit “techy” in places, but I will try to keep it as simple as I can, and I hope you find it interesting nonetheless.
We haven’t fully taken the jump yet, and we’re not sure we will – not in the near future, in any case. To put a little flesh on the bones of that statement, I will include a number of graphs and charts showing how, in particular, Microsoft Azure sized up when we ran real-world tests against it versus our test database server, DEV2 (a pretty simple 4U server from circa 2008 with a single dual-core Xeon and about 4GB of RAM), and a beefier backup machine, SQL4 (a 2TB fibre-attached SAN with eighty or so 72GB disks in RAID 10, four dual-core Xeons and 32GB of RAM).
Day to day we use SQL 2005 Enterprise Edition for our core database, with a redundant backup machine (SQL4). We also use various flavours of MySQL and PostgreSQL, but we’re not going to compare those here; we’ll stick to MSSQL for now. Of course, we know there are other options out there for hosted MSSQL, not least Amazon EC2 and the like, but we’re going to stick to core MS products and services for the purposes of this article – comparing our in-house solutions to Amazon and friends is for another article!
How the tests were performed.
A sample database was prepared based on data stored in the live WAYN database. Due to the Azure database size restriction at the time of 10GB (now 50GB), the test database contained data only for 2 million migrated members.
Four tests were performed; each test was executed 50 times on each database.
Tests 1 & 2: CPU and I/O reads
The first two tests focused on CPU processing power and I/O reads. Based on logs stored in the database, the 150 worst cases (in terms of duration) of executions of a very complex Search procedure for migrated members were selected. These were divided asymmetrically into two tests:
• Test 1 containing 50 executions of aforementioned procedure
• Test 2 containing 100 executions
The procedure is a rather complicated piece of art, boasting over 1200 lines of code and using temporary tables, dynamic queries and multiple SELECT commands to collect the most relevant set of “matching” members for a given individual. It was slightly modified, firstly to return no recordset (although a simple COUNT(*) was performed on the result to make sure it was actually calculated), and secondly to be compliant with Azure SQL requirements and limitations. Due to the complexity of the procedure, its results are not cached, which means subsequent executions take a similar amount of time even for the same primary key.
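To give a flavour of the “return no recordset, but still force the calculation” modification, here is a minimal sketch of the wrapper pattern – note that the procedure and parameter names (usp_MemberSearch, @MemberID) are purely illustrative, not our real ones:

```sql
-- Illustrative only: capture the search procedure's output into a temp table,
-- then reduce it to a single COUNT(*) so the timing measures computation
-- rather than the cost of streaming a large recordset back to the client.
DECLARE @Matches INT;

CREATE TABLE #SearchResult (MemberID INT NOT NULL);

INSERT INTO #SearchResult (MemberID)
EXEC dbo.usp_MemberSearch @MemberID = 12345;  -- hypothetical proc/parameter

SELECT @Matches = COUNT(*) FROM #SearchResult;

DROP TABLE #SearchResult;
```

The point of the COUNT(*) is simply to stop the optimiser (or the test harness) from skipping the work entirely when no rows are returned.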
Test 3: I/O writes – Inserts
The third test was a set of simple INSERT INTO … SELECT statements simulating, to some extent, bulk import behaviour – each INSERT added between 20 and 60 rows. A single test execution performed 500 INSERTs, resulting in 14,123 rows added to a table. Before each execution the destination table was truncated. The destination table had a structure identical to the source table, including all indexes, constraints and primary keys.
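The shape of Test 3 looks roughly like the following sketch – the table and column names here are made up for illustration, and the range variables stand in for whatever slicing the real script used:

```sql
-- Illustrative only: reset the destination, then run a batch of small
-- INSERT INTO ... SELECT statements, each copying a 20-60 row slice from a
-- source table with an identical structure (indexes, constraints, PKs).
TRUNCATE TABLE dbo.MemberImport_Dest;

-- one of the 500 INSERTs in a single test execution:
INSERT INTO dbo.MemberImport_Dest (MemberID, Name, Country)
SELECT MemberID, Name, Country
FROM dbo.MemberImport_Source
WHERE MemberID BETWEEN @RangeStart AND @RangeEnd;  -- ~20-60 rows per slice
```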
Test 4: I/O writes – Updates
The last test ran a set of four UPDATEs, two of which could use an index to select the rows to update. The main objective of this test was to check the performance of changing data in place – updating a field and then using the updated values in a new query.
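Again as a hedged sketch rather than our actual statements (all table, column and index names below are hypothetical), the shape of Test 4 was along these lines:

```sql
-- Illustrative only: four UPDATEs, two of which can seek on an index,
-- followed by a query that reads back the freshly written values.
UPDATE dbo.Members
SET LastLogin = GETUTCDATE()
WHERE MemberID = @MemberID;                 -- indexed: primary key seek

UPDATE dbo.Members
SET LoginCount = LoginCount + 1
WHERE Email = @Email;                       -- indexed: seek on an email index

UPDATE dbo.Members
SET Status = 'active'
WHERE DATEDIFF(DAY, LastLogin, GETUTCDATE()) < 30;  -- not sargable: scan

UPDATE dbo.Members
SET Status = 'dormant'
WHERE Status IS NULL AND LoginCount = 0;    -- no supporting index: scan

-- then use the updated values in a new query:
SELECT COUNT(*) FROM dbo.Members WHERE Status = 'active';
```

Mixing indexed and non-indexed predicates is what makes this a reasonable proxy for “change data in place, then query it” workloads.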
Test execution and processing results
Each test was executed 50 times using the SQLCMD command-line tool with the packet size option set to 16384. The tests were performed on three different databases: DEV2 (local development machine, SQL 2005 SP3), SQL4 (live-environment backup database, SQL 2005 SP3) and Azure. Additionally, some of the tests were performed on an evaluation version of SQL 2008 R2. The time required to execute each test, as reported by SQLCMD, was used as the test metric. The results were processed with a Perl script and copied into Excel, where they were normalized with the Azure results chosen as the reference point.
The first two tests show there is definitely potential in Azure. The queries ran significantly faster on Azure than they did in both the backup and test environments. One must bear in mind that the results for the backup environment were affected by log restores – obviously the website could not be put in jeopardy just for the sake of running tests, and we need to run other checks on the backup database from time to time to guarantee 100% accuracy.
The last two tests exposed the fact that Azure is clearly slower at saving and modifying data than our physical machines. For most users that might not be a huge drawback, as the vast majority of database operations are usually reads; however, in our case, with extensive use of Memcached and local caching, it could turn out to be a real problem – especially for real-time processing. Also, please note how fast SQL 2008 R2 is at loading data – if you happen to load data on a regular basis, upgrading might be a worthwhile investment!
So far Azure seems a good candidate for offline processing, storing logs, statistical counters and any other data which does not need to be updated and queried in real time. Running the website on SQL Azure is not really an option for us at this time, not only because of the current limits on database size, but also because of latency – a slowdown of 80–100ms on each query could be quite difficult to live with. That being said, we can see a lot of potential in Azure, and once the platform has matured a bit we suspect it will be a really good alternative to hosting your own database infrastructure, with the inherent costs and problems that brings.
Based on this, we will hold off on SQL Azure for a little while; perhaps in one or two more releases MS will remove the database size restrictions and fill in some of the other missing pieces, like the ability to do SQL backups (we note that Azure is constantly progressing – one of the latest improvements is a Database Copy feature). Until then, we will keep a close eye on Azure and hope that one day soon it might become a real option for us to migrate to, in full or in part.
CTO – WAYN.COM