Using Azure Storage Emulator Command-Line Tool: WAStorageEmulator.exe

Starting from version 3.0 of the emulator, a few things have changed and lots of people are not aware of this. When you launch the Storage Emulator now, you will see a command prompt pop up. I wanted to write this short blog post just to give you a head start.
2014-09-15 09:28
Tugberk Ugurlu


When you download the Azure SDK for Visual Studio, it brings down a bunch of stuff for you, such as the Azure Storage and Compute Emulators. With a worker or web role project in Visual Studio, we can get both emulators up and running by simply firing up the project. However, if we are not working with a web or worker role, we need a way to fire up the storage emulator ourselves and it is actually pretty easy. Starting from version 3.0 of the emulator, a few things have changed and lots of people are not aware of this. I wanted to write this short blog post just to give you a head start.

When you launch the Storage Emulator now, you will see a command prompt pop up.

This is WAStorageEmulator.exe, the storage emulator command-line tool, which allows you to perform a bunch of operations such as starting/stopping the emulator and querying its status. You can either run this command prompt as I did above or you can navigate to the C:\Program Files (x86)\Microsoft SDKs\Azure\Storage Emulator\ directory and find WAStorageEmulator.exe there. You can read up on the Storage Emulator Command-Line Tool Reference on MSDN to find out what commands are available. What I would like to point out is the fact that you can now run the emulator in-process through the command prompt, which is quite nice:

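Here is roughly how that looks (a sketch based on the commands documented in the command-line reference; double-check the switches against your SDK version):

rem check whether the emulator is running and which endpoints it listens on
WAStorageEmulator.exe status

rem start the emulator inside the current command prompt process
WAStorageEmulator.exe start -inprocess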

The other thing is that you can now quite easily get the storage emulator up and running in your integration tests. You can even reset the whole storage account, start the emulator at the beginning of your test run and stop it at the end. Check out the Using the Azure Storage Emulator for Development and Testing section on MSDN for further details.
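For example, a test run could shell out to something along these lines before and after the suite (a sketch using the documented clear/start/stop commands; the path assumes the default SDK install location):

rem wipe all blob, table and queue data from the emulator
"%ProgramFiles(x86)%\Microsoft SDKs\Azure\Storage Emulator\WAStorageEmulator.exe" clear all

rem bring the emulator up before the tests run
"%ProgramFiles(x86)%\Microsoft SDKs\Azure\Storage Emulator\WAStorageEmulator.exe" start

rem ... run the integration tests here ...

rem tear the emulator down afterwards
"%ProgramFiles(x86)%\Microsoft SDKs\Azure\Storage Emulator\WAStorageEmulator.exe" stop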

A Gentle Introduction to Azure Search

The Microsoft Azure team released Azure Search as a preview product a few days ago, a hosted search service solution by Microsoft. Azure Search is a suitable product if you are dealing with a high volume of data (millions of records) and want to have efficient, complex and clever search over that data. In this post, I will try to lay out some fundamentals about this service with a very high-level introduction.
2014-09-10 12:02
Tugberk Ugurlu


With many of the applications we build as software developers, we need our data to be exposed and we want that data to be within easy reach so that the user of the application can find what they are looking for easily. This task is especially tricky if you have a high amount of data (millions, even billions of records) in your system. At that point, the application needs to give the user a great and flawless experience so that the user can filter down the results based on what they are actually looking for. Don't we have solutions to address these problems? Of course we do, and products such as Elasticsearch and Apache Solr are top-notch problem solvers for this matter. However, hosting these products in your environment and making them scalable is a whole other job.

To address these problems, the Microsoft Azure team released Azure Search as a preview product a few days ago, a hosted search service solution by Microsoft. Azure Search is a suitable product if you are dealing with a high volume of data (millions of records) and want to have efficient, complex and clever search over that data. If you have worked with a search engine product (such as Elasticsearch, Apache Solr, etc.) before, you will feel very comfortable with Azure Search as it has many similar features. In fact, Azure Search sits on top of Elasticsearch to provide its full-text search function. However, you shouldn't see this brand-new product as a hosted Elasticsearch service on Azure because it has a completely different public interface.

In this post, I will try to lay out some fundamentals about this service with a very high-level introduction. I'm hoping that it's also going to be a starting point for my future Azure Search blog posts :)

Getting to Know Azure Search Service

When I look at the Azure Search service, I see it as four pieces which together give us the whole experience:

  • Search Service
  • Search Unit
  • Index
  • Document

The search service is the highest level of the hierarchy and it contains the provisioned search unit(s). A few concerns, such as authentication and scaling, are also handled at the search service level.

Search units allow for scaling of QPS (queries per second), document count and document size. This also means that search units are the key concept for high availability and throughput. As a side note, high availability requires at least 3 replicas for the preview.

An index holds a collection of documents based on a defined schema which specifies the capabilities of the index (we will touch on this schema later). A search service can contain multiple indexes.

Lastly, a document is the actual holder of the data and conforms to the schema of the index it lives in. A document has a key and this key needs to be unique within the index. A document also has fields to represent the data. Fields of a document carry attributes and those attributes define the capabilities of the field, such as whether it can be used to filter the results. Also note that the number of documents an index can contain is limited based on the search units the service has.

Windows Azure Portal Experience

Let's first have a look at the portal experience and how we can get a search service ready for our use. Azure Search is not available through the current Microsoft Azure portal; it's only available through the preview portal. Inside the new portal, click the big plus sign at the bottom left and then click "Everything".

This is going to get you to the "Gallery". From there, click "Data, storage, cache + backup" and then click "Search" in the new section.

You will get a nice intro to the Microsoft Azure Search service in the new window. Hit "Create" there.

Keep in mind that the service name must only contain lowercase letters, digits or dashes, cannot use a dash as the first two characters or the last character, cannot contain consecutive dashes, and is limited to between 2 and 15 characters in length. Other naming conventions for the service have been laid out here under the Naming Conventions section.

When you come to selecting the Pricing Tier, it's time to make a decision about your usage scenario.

Now, there are two options: Standard and Free. The free one should be considered the sandbox experience because it's too limiting in terms of both performance and storage space. You shouldn't try to evaluate the Azure Search service with the free tier. It is, however, great for evaluating the HTTP API. You can create a free service and run your HTTP requests against it.

The standard tier is the one you would want to choose for production use. It can be scaled both in terms of QPS (queries per second) and document size through shards and replicas. Head to the "Configure Search in the Azure Preview portal" article for more in-depth information about scaling.

When you are done setting up your service, you can now get the admin key or the query key from the portal and start hitting the Azure Search HTTP (or REST, if you want to call it that) API.

Azure Search HTTP API

The Azure Search service is managed through its HTTP API and it's not hard to guess that even the Azure Portal uses this API to manage the service. It's a lightweight API which understands JSON as the content type. When we look at it, we can divide this HTTP API into three parts:

The index management part of the API allows us to manage the indexes with various operations such as creating, deleting and listing them. It also allows us to see some index statistics. Creating the index is probably going to be the first operation you will perform and it has the following structure:

POST https://{search-service-name}.search.windows.net/indexes?api-version=2014-07-31-Preview HTTP/1.1
User-Agent: Fiddler
api-key: {your-api-key}
Content-Type: application/json
Host:{search-service-name}.search.windows.net

{
	"name": "employees",
	"fields": [{
		"name": "employeeId",
		"type": "Edm.String",
		"key": true,
		"searchable": false
	},
	{
		"name": "firstName",
		"type": "Edm.String"
	},
	{
		"name": "lastName",
		"type": "Edm.String"
	},
	{
		"name": "age",
		"type": "Edm.Int32"
	},
	{
		"name": "about",
		"type": "Edm.String",
		"filterable": false,
		"facetable": false
	},
	{
		"name": "interests",
		"type": "Collection(Edm.String)"
	}]
}

In the above request, you can also spot a few more things which apply to every API call we make. There is a header we are sending with the request: api-key. This is where you are supposed to put your API key. Also, we are passing the API version through a query string parameter called api-version. Have a look at the Azure Search REST API documentation on MSDN for further detailed information.

With this request, we are specifying the schema of the index. Keep in mind that schema updates are limited at the time of this writing. Although existing fields cannot be changed or deleted, new fields can be added at any time. When a new field is added, all existing documents in the index will automatically have a null value for that field. No additional storage space will be consumed until new documents are added to the index. Have a look at the Update Index API documentation for further information on index schema update.
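As an illustration, adding a hypothetical location field to the employees index above would be a PUT of the full, extended schema back to the index (a sketch based on my reading of the Update Index documentation; verify the exact semantics there):

PUT https://{search-service-name}.search.windows.net/indexes/employees?api-version=2014-07-31-Preview HTTP/1.1
User-Agent: Fiddler
api-key: {your-api-key}
Content-Type: application/json
Host: {search-service-name}.search.windows.net

{
	"name": "employees",
	"fields": [{
		"name": "employeeId",
		"type": "Edm.String",
		"key": true,
		"searchable": false
	},
	{
		"name": "firstName",
		"type": "Edm.String"
	},
	{
		"name": "lastName",
		"type": "Edm.String"
	},
	{
		"name": "age",
		"type": "Edm.Int32"
	},
	{
		"name": "about",
		"type": "Edm.String",
		"filterable": false,
		"facetable": false
	},
	{
		"name": "interests",
		"type": "Collection(Edm.String)"
	},
	{
		"name": "location",
		"type": "Edm.String"
	}]
}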

After you have your index schema defined, you can start populating your index with the index population API. The index population API is a little bit different and I honestly don’t like it (I have a feeling that Darrel Miller won’t like it, either :)). The reason why I don’t like it is the way we define the operation. With this HTTP API, we can add a new document, or update or remove an existing one. However, we are defining the type of the operation inside the request body, which is so weird if you ask me. The other weird thing about this API is that you can send multiple operations in one HTTP request by putting them inside a JSON array. The important fact here is that those operations don’t run in a transaction, which means that some of them may succeed and some of them may fail. So, how do we know which ones actually failed? The response will contain a JSON array indicating each operation’s status. Nothing wrong with that, but why do we reinvent the wheel? :) I would be happier sending batch requests using the multipart content type. Anyway, enough bitching about the API :) Here is a sample request to add a new document to the index:

POST https://{search-service-name}.search.windows.net/indexes/employees/docs/index?api-version=2014-07-31-Preview HTTP/1.1
User-Agent: Fiddler
api-key: {your-api-key}
Content-Type: application/json
Host: {search-service-name}.search.windows.net

{
	"value": [{
		"@search.action": "upload",
		"employeeId": "1",
		"firstName": "Jane",
		"lastName": "Smith",
		"age": 32,
		"about": "I like to collect rock albums",
		"interests": ["music"]
	}]
}

As mentioned, you can send the operations in a batch:

POST https://{search-service-name}.search.windows.net/indexes/employees/docs/index?api-version=2014-07-31-Preview HTTP/1.1
User-Agent: Fiddler
api-key: {your-api-key}
Content-Type: application/json
Host: {search-service-name}.search.windows.net

{
	"value": [{
		"@search.action": "upload",
		"employeeId": "2",
		"firstName": "Douglas",
		"lastName": "Fir",
		"age": 35,
		"about": "I like to build cabinets",
		"interests": ["forestry"]
	},
	{
		"@search.action": "upload",
		"employeeId": "3",
		"firstName": "John",
		"lastName": "Fir",
		"age": 25,
		"about": "I love to go rock climbing",
		"interests": ["sports", "music"]
	}]
}
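The response then reports the outcome of each operation separately. Based on the documentation, a fully successful batch comes back looking roughly like this (treat the exact property names as an assumption on my part; they may differ in later API versions):

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8

{
	"value": [{
		"key": "2",
		"status": true,
		"errorMessage": null
	},
	{
		"key": "3",
		"status": true,
		"errorMessage": null
	}]
}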

Check out the great documentation about the index population API to learn more about it.

Lastly, there are the query and lookup APIs where you can use OData 4.0 expression syntax to define your query. Go and check out their documentation as well.
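To give you a flavor, here is what a search query against the employees index above might look like; treat it as an illustrative sketch and check the documentation for the full query syntax:

GET https://{search-service-name}.search.windows.net/indexes/employees/docs?search=rock&$filter=age+gt+30&api-version=2014-07-31-Preview HTTP/1.1
User-Agent: Fiddler
api-key: {your-api-key}
Host: {search-service-name}.search.windows.net

A lookup, on the other hand, retrieves a single document by its key:

GET https://{search-service-name}.search.windows.net/indexes/employees/docs/2?api-version=2014-07-31-Preview HTTP/1.1
User-Agent: Fiddler
api-key: {your-api-key}
Host: {search-service-name}.search.windows.net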

Even though the service is so new, there are already great things happening around it. Sandrino Di Mattia has two cool open source projects on Azure Search: one is the RedDog.Search .NET client and the other is the RedDog Search Portal, a web-based UI tool to manage your Azure Search service. There is also an Azure Search node.js / JavaScript client from Richard Astbury. I strongly encourage you to check them out. There are also two great video presentations about Azure Search by Liam Cavanagh, a Senior Program Manager in the Azure Data Platform Incubation team at Microsoft.

Stop what you are doing and go watch them if you care about Azure Search. They will give you a nice overview of the product and those videos could be your starting point.

You can also view my talk about Azure Search from AzureConf 2014.



I am proud to say that I will be giving a talk on Azure Search, a fully managed, cloud-based service that allows developers to build rich search applications using REST APIs, at AzureConf 2014! AzureConf is a community event hosted by Microsoft and it will be streamed live on the 21st of October, 2014.

Here is what AzureConf actually is all about:

On October 21st, 2014, Microsoft will be hosting AzureConf, another free event for the Azure community. This event will feature a keynote presentation by Scott Guthrie, along with numerous sessions executed by Azure community members. Streamed live for an online audience on Channel 9, the event will allow you to see how developers just like you are using Azure to develop robust, scalable applications on Azure. Community members from all over the world will join known speakers such as Michael Collier, Mike Martin, Rick Garibay, and Chris Auld in the Channel 9 studios to present their own inventions and experiences. Whether you’re just learning Microsoft Azure or you've already achieved success on the platform, you won’t want to miss this special event.

As mentioned, the event will be streamed live and you need to register through the AzureConf web site. You should also check out the schedule, the speakers list and the AzureConf 2014 Lanyrd page.

I wouldn't miss this awesome event. Seriously, add this to your calendars :)

Microsoft Turkey Summer School 2014 - ASP.NET Web API and SignalR Talk

In the context of Microsoft Turkey Summer School 2014, I had a chance to give a talk on ASP.NET Web API and ASP.NET SignalR a few days ago at the Microsoft Turkey office. Here are the slides, the recording video and the references from the talk.
2014-08-17 12:04
Tugberk Ugurlu




I have been designing HTTP APIs (Web APIs, if you want to call them that) for a fair amount of time now and I have been handling HTTP DELETE operations the same way every time. Here is a sample.

HTTP GET Request to get the car:

GET http://localhost:25135/api/cars/3 HTTP/1.1
User-Agent: Fiddler
Accept: application/json
Host: localhost:25135

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Date: Wed, 25 Jun 2014 12:36:48 GMT
Content-Length: 68

{"Id":3,"Make":"Make3","Model":"Model1","Year":2009,"Price":67437.0}

HTTP DELETE Request to delete the car:

DELETE http://localhost:25135/api/cars/3 HTTP/1.1
User-Agent: Fiddler
Accept: application/json
Host: localhost:25135

HTTP/1.1 204 No Content
Date: Wed, 25 Jun 2014 12:36:52 GMT

Now we can see that the car is removed, as I received a 204 for my HTTP DELETE request. Let's send another HTTP DELETE to the same resource.

HTTP DELETE Request to delete the car and receive 404:

DELETE http://localhost:25135/api/cars/3 HTTP/1.1
User-Agent: Fiddler
Accept: application/json
Host: localhost:25135

HTTP/1.1 404 Not Found
Date: Wed, 25 Jun 2014 12:36:52 GMT
Content-Length: 0

I received a 404 because /api/cars/3 is not a URI which points to a resource in my system. This is not a problem at all and it's a correct way of handling the case, as I have been doing for a long time now. Idempotency is also preserved because no matter how many times you send this HTTP DELETE request, additional changes to the state of the server will not occur since the resource is already removed. The additional HTTP DELETE requests will just do nothing.

However, here is the question in my mind: what is the real intent of an HTTP DELETE request?

  • Ensuring the resource is removed with the given HTTP DELETE request.
  • Ensuring the resource is removed.

Here is what the HTTP 1.1 spec says about HTTP DELETE:

The DELETE method requests that the origin server delete the resource identified by the Request-URI. This method MAY be overridden by human intervention (or other means) on the origin server. The client cannot be guaranteed that the operation has been carried out, even if the status code returned from the origin server indicates that the action has been completed successfully. However, the server SHOULD NOT indicate success unless, at the time the response is given, it intends to delete the resource or move it to an inaccessible location.

A successful response SHOULD be 200 (OK) if the response includes an entity describing the status, 202 (Accepted) if the action has not yet been enacted, or 204 (No Content) if the action has been enacted but the response does not include an entity.

If the request passes through a cache and the Request-URI identifies one or more currently cached entities, those entries SHOULD be treated as stale. Responses to this method are not cacheable.

I don't know about you but I'm unable to figure out which of my two intents above is specified here. However, I think that the HTTP DELETE request’s intent is to ensure that the resource is removed and cannot be accessed anymore. What does this mean for my application? It means that if an HTTP DELETE operation succeeds, return a success status code (200, 202 or 204). If the resource is already removed and you receive an HTTP DELETE request for that resource, return 200 or 204 in that case, not 404. This seems more semantic to me and it is certainly easier for the API consumers.
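In other words, under that approach the second DELETE against the already-removed car would come back like this instead of the 404 above (an illustrative exchange, not what my sample application currently returns):

DELETE http://localhost:25135/api/cars/3 HTTP/1.1
User-Agent: Fiddler
Accept: application/json
Host: localhost:25135

HTTP/1.1 204 No Content
Date: Wed, 25 Jun 2014 12:36:52 GMT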

What do you think?
