The term Software Defined Storage (SDS) has been given a variety of different interpretations and meanings by various vendors since VMware introduced the concept of the Software Defined Datacenter (SDDC) in 2012.
The SDDC is an end state reference architecture designed to increase agility and operational efficiency in enterprises transitioning to ITaaS. Think of it as transforming datacenters to follow the operational model employed by the likes of Google and Facebook today.
The premise of the SDDC is the abstraction and pooling of the 3 main food groups: Compute, Storage and Networking, all overlaid by a common policy driven management framework. If you ever have a chance to take a look into a Facebook datacenter (a simple YouTube search will reveal all!), all hardware is very much commoditised – the smarts are delivered via a Control Plane in software, which is completely abstracted from the Data/IO Plane.
What Part Does Software Defined Storage Play?
It’s pretty simple – if you look at what we did when we first pioneered virtualization technology, we severed the ties of the Application/OS to the underlying physical hardware. This was seen as a revolutionary breakthrough: because we completely abstracted the underlying hardware, the Application/OS container (Virtual Machine) could freely move around from tin to tin (server) irrespective of the make, model or vendor (yes, certain caveats do apply).
Through pooling we were able to consolidate multiple applications onto the one physical server… and with more physical servers, a clustered resource was developed!
These two key elements, abstraction and pooling, saved companies billions of dollars worldwide.
Now take the above: if we take the same operational model we used for compute virtualization (abstraction and pooling) and apply it to storage, then we really start to revolutionise the way we consume storage and define SLAs for it. That is, we create what’s called a Virtual Data Plane. This Virtual Data Plane is the same concept as the Distributed Virtual Switch (dvSwitch) and the security advancements we’ve made with NSX: separating out the Control and Data planes to a point where we can integrate with existing SAN/NAS devices, with Hyperconverged Storage (hello VSAN!) and with the new form of storage being adopted, Cloud/Object Based Storage.
So you’re thinking integration… what does this entail and what’s next? The answer is simple. By integrating with the various forms of storage subsystems to a point where we (VMware) have control, we are essentially introducing what’s called Virtual Machine Centric Data Services. These data services include VMware’s vision for Data Protection (VMware Advanced\Data Protection), Availability\DR (Replication/SRM) and Performance (IOPS).
In short, having such Data Services available means that we can finally align the demands of the application to the underlying storage subsystem. But it doesn’t stop there, and this is where the Partner Ecosystem is very important: not only can we integrate with VMware’s vision for the above Data Services, but also with 3rd party services provided by our rich ecosystem.
Once these VM Centric Data Services are created and surfaced, we can control everything via a policy driven management framework (the key ingredient for SDDC), formally referred to as the Policy Driven Control Plane, and so provision Virtual Machines with a blueprint defining the SLAs required by the underlying application.
Couple the above with automated self-service provisioning using VMware’s vCloud Automation Center (vCAC) and suddenly you have a very powerful automation and orchestration engine.
How Does VSAN Deliver SDDS?
If you haven’t had a chance to play with VMware’s VSAN then test drive it here: VMware Hands On Labs because we are delivering SDDS with VSAN right now!
VSAN works at a cluster level by aggregating the local storage within your servers (1 Flash Device + 1 SAS/SATA Magnetic disk minimum per server) across 3 nodes (minimum amount to get started) thereby presenting a single Datastore for consumption by your vSphere hosts.
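To make the minimums above a little more concrete, here is a rough Python sketch of the aggregation idea. It is only an illustration and not how VSAN is implemented: the Host class, the function names and the capacities are all made up, and it assumes that only the magnetic disks count towards raw datastore capacity (the flash device being treated as a cache).

```python
from dataclasses import dataclass
from typing import List

MIN_HOSTS = 3  # minimum number of contributing nodes mentioned above


@dataclass
class Host:
    """Hypothetical description of one server's local storage."""
    name: str
    flash_devices: int
    magnetic_disks_gb: List[int]  # capacity of each SAS/SATA disk, in GB


def contributes_storage(host: Host) -> bool:
    # A host needs at least 1 flash device and 1 magnetic disk.
    return host.flash_devices >= 1 and len(host.magnetic_disks_gb) >= 1


def raw_datastore_capacity_gb(hosts: List[Host]) -> int:
    """Pool the local magnetic disks of all contributing hosts.

    Assumption: flash is used as cache and is not counted as capacity.
    """
    contributing = [h for h in hosts if contributes_storage(h)]
    if len(contributing) < MIN_HOSTS:
        raise ValueError(
            f"need at least {MIN_HOSTS} contributing hosts, "
            f"got {len(contributing)}"
        )
    return sum(sum(h.magnetic_disks_gb) for h in contributing)
```

Feeding it three hosts with one flash device each returns the sum of their magnetic disk capacities as one pooled figure, while two hosts raise an error because the minimum is not met.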
The technology can be quite compelling: for one, the operational complexity is greatly reduced when compared to provisioning storage from a traditional SAN/NAS, and there are also CAPEX/TCO benefits in that you can pay as you go and scale up or scale out linearly – the choice is yours.
So on the topic of SDDS, VSAN is essentially controlled via Storage Policy Based Management (SPBM). Policies, which are defined using the VSAN VASA Provider, can encompass:
- Number of failures to tolerate
- Number of stripes per object
- Flash Read Cache Reservation
- Object Space Reservation
- Force Provisioning
What this translates to is that during provisioning of a Virtual Machine, a user can select/attach a relevant policy (above) which will serve as a blueprint of how the underlying storage for the VM should be consumed (SLA).
Does the VM require a RAID 1 equivalent? Setting the Number of failures to tolerate policy to 1 will ensure that there is another host in your cluster containing the objects related to that VM, to protect against host, network and disk failure.
If for whatever reason the Policy/SLA can no longer be met by the SPBM, then an “Out Of Compliance” error will be reported within the vSphere client/Virtual Machine.
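As a rough illustration of the policy-as-blueprint idea, the sketch below models the five settings listed above as a Python dataclass. The class, field names and default values are mine, not a VMware API. The arithmetic reflects my understanding of mirroring in VSAN, namely that tolerating n failures needs n + 1 copies of the data spread over at least 2n + 1 hosts. The compliance check is deliberately simplistic and only looks at the host count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    """Hypothetical model of a VSAN storage policy (names are illustrative)."""
    failures_to_tolerate: int = 1
    stripes_per_object: int = 1
    flash_read_cache_reservation_pct: float = 0.0
    object_space_reservation_pct: float = 0.0
    force_provisioning: bool = False


def replicas_required(policy: StoragePolicy) -> int:
    # Tolerating n failures needs n + 1 copies of the data.
    return policy.failures_to_tolerate + 1


def hosts_required(policy: StoragePolicy) -> int:
    # Assumption: mirroring plus witnesses needs 2n + 1 hosts.
    return 2 * policy.failures_to_tolerate + 1


def compliance_status(policy: StoragePolicy, healthy_hosts: int) -> str:
    """Simplified check: can the policy still be met by the remaining hosts?"""
    if healthy_hosts >= hosts_required(policy):
        return "Compliant"
    return "Out of Compliance"
```

With failures_to_tolerate set to 1 this gives 2 replicas and 3 hosts, which lines up with the 3 node minimum; drop to 2 healthy hosts and the sketch reports "Out of Compliance".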
This is just the start of what VSAN has to offer in its initial 1.0 release!
What about existing SAN/NAS devices? – Welcome to vVOLS
The question that I get asked is: how does SDS apply to my existing SAN/NAS based devices? Well, the answer to that is Virtual Volumes (vVOLS).
Before we get into what Virtual Volumes are all about, let’s take a look at some of the challenges associated with storage today.
Currently when we design/provision storage, we define SLAs based on the characteristics of the storage. Typical customers have been balancing performance vs capacity vs availability and recoverability using traditional static “bins”, or a tiering method of Gold, Silver and Bronze Datastores. Gold would typically serve the highest IOPS capability with lower capacity (cost per GB $$$), a low RPO and some sort of replication. On the other end of the scale, the lowest tier would include higher capacity (NL-SAS/SATA) disk with a lower cost per GB and a higher RPO/RTO.
Whilst there are multiple vendor auto-tiering technologies in place to address some of the above challenges, we’re still bound to these “static bins”, which leave it to the VM admin/application owner to decide where their application should live based on what it “might” require rather than what is currently being used. After all, the premise is always “just in case”, as we don’t want to get a call at 2am only to be told that our SAP instance is performing poorly.
Through virtualization, we’ve learnt that the traditional storage “one size fits all” approach isn’t always going to work. It’s kind of like going to McDonald’s, where you have a set menu, vs going to your favourite burger place (Five Guys Burgers in NYC), which offers you choice and customisation. After all, every customer is different, just like our VMs.
Wouldn’t it be nice if it was a two way street, e.g. our VM requesting what it requires and the storage reporting what its capabilities are?
This is what Virtual Volumes are all about. At a high level, no longer will we be creating static LUNs segmented off with a one-size-fits-all approach. The storage array will report to vSphere what its underlying capabilities are, which in turn will form the VM Centric Data Services (see above).
Once these capabilities surface, the Policy Driven Control Plane, in conjunction with the Storage Policy you assign to the Virtual Machine(s), will ensure that the appropriate SLA is being met.
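The “two way street” can be sketched in a few lines of Python: the array advertises what it can do, the VM’s policy states what it needs, and something in the middle matches the two. Everything here (class names, fields, the specific capabilities) is hypothetical and chosen purely for illustration; it is not the VASA or vVOLS API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ArrayCapabilities:
    """What a storage array reports upwards (illustrative fields only)."""
    name: str
    max_iops: int
    replication: bool
    snapshots: bool


@dataclass
class VmPolicy:
    """What a VM requests from storage (illustrative fields only)."""
    min_iops: int
    needs_replication: bool = False
    needs_snapshots: bool = False


def satisfies(cap: ArrayCapabilities, policy: VmPolicy) -> bool:
    # Every requirement in the policy must be covered by a reported capability.
    return (
        cap.max_iops >= policy.min_iops
        and (cap.replication or not policy.needs_replication)
        and (cap.snapshots or not policy.needs_snapshots)
    )


def compatible_arrays(arrays: List[ArrayCapabilities],
                      policy: VmPolicy) -> List[str]:
    """Return the names of arrays able to meet the VM's policy."""
    return [a.name for a in arrays if satisfies(a, policy)]
```

The point of the design is that placement is driven by what the VM asks for and what the array actually reports, rather than by a pre-built Gold/Silver/Bronze bin.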
By no means are we configuring your storage array! vVOLS is a new Operational Model/Firmware to support SDS!
So you’ll still need to check whether vVOLS are on your storage vendor’s roadmap, as well as set up and configure your array and expose those capabilities.
Now, a question that I get is: what about LUNs? How is the above possible when we have legacy LUNs defined? Well, the answer is that there is no more LUN or volume management moving forward with vVOLS. VMDKs will now appear as native objects on your SAN and NAS devices. This is how we’re able to get per VM (VMDK) granularity when it comes to defining specific SLAs. There is never a “one size fits all” approach in today’s world.
Object Based Storage and Virtual Volumes (vVols)
Object-based storage is a new storage architecture which manages data as “objects”, rather than as a traditional file system (NFS), which employs a hierarchy, or as block-based storage, which treats data as blocks, sectors and tracks. Object-based storage doesn’t reference a hierarchy; in fact every object exists at the SAME level in a flat address space called a storage pool.
Each vSphere storage object can include:
- The Virtual Machine home or “namespace directory”,
- A swap object (if the virtual machine is powered on),
- Virtual disks/VMDKs,
- Delta-disks created for snapshots (Each delta-disk is an object).
Objects carry a variable amount of metadata and a globally unique identifier (GUID). To better understand what object storage is, take for instance the ‘chaotic storage’ policy employed by Amazon in their warehouses today. With warehouses spanning over 1 million square feet and over 65000 employees, items are not stacked in a traditional manner, where you would have a certain row dedicated to books and a shelf purely dedicated to a VMware Press book on VSAN (it just came out.. get it here). What happens when that book’s stock depletes? Wasted space, until the stock replenishes… But what about when that book becomes super popular and a member of the warehouse staff needs to walk over a KM or mile just to retrieve it? Wasted time and operator resources! So what Amazon do (quite clever) is dynamically move stock around the warehouse to better utilise warehouse space (whitespace) as well as improve order/operator efficiency.
Using the above analogy, products (objects) are stored in the warehouse (Storage Pool) and referenced by a barcode (Globally Unique Identifier). At any point in time, the product(s)\item(s) can be retrieved by referencing the metadata associated with the item, which determines its exact location. It doesn’t matter if that product moves to a different physical location (row, shelf or warehouse) – the item can still be retrieved! Pretty fancy hey?
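The warehouse analogy maps onto a very small Python sketch: a flat pool keyed by a generated identifier, where the physical location is just metadata that can change without affecting retrieval. The StoragePool class and its methods are invented for this illustration and don’t correspond to any particular product.

```python
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass
class StoredObject:
    data: bytes
    metadata: Dict[str, str]
    location: str  # where the object physically lives right now


class StoragePool:
    """Flat address space: every object sits at the same level, keyed by ID."""

    def __init__(self) -> None:
        self._objects: Dict[str, StoredObject] = {}

    def put(self, data: bytes, location: str, **metadata: str) -> str:
        guid = str(uuid.uuid4())  # the "barcode" in the analogy
        self._objects[guid] = StoredObject(data, dict(metadata), location)
        return guid

    def move(self, guid: str, new_location: str) -> None:
        # Relocating an object changes its metadata, not its identifier.
        self._objects[guid].location = new_location

    def locate(self, guid: str) -> str:
        return self._objects[guid].location

    def get(self, guid: str) -> bytes:
        return self._objects[guid].data
```

Store an object, move it to a different “shelf”, and get() with the original identifier still returns the same data, which is the property the barcode analogy is getting at.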
Bringing it all together – The End Goal!
In today’s storage market, there are so many new storage vendors popping up, each with their own unique value proposition. We’re seeing the Hyper converged storage market (VSAN), Flash based caching mechanisms, different flavours of SAN.. NAS.. all SSD…and the new craze – Cloud Based Storage.
In fact most customers that I visit have more than 1 Storage subsystem\vendor as well as strategy (short/long term) employed.
With Software Defined Storage coupled with Cloud Automation (SDDC), the automated provisioning and placement of applications becomes very compelling. Imagine having our Virtual Machine(s)\files\objects move between storage devices, based on cost, performance, capacity… anywhere… anytime?
This is what Software Defined Storage is all about.