Archive for the 'FileStream' Category

05th Jun 2008

TechEd 2008 Developers - Data Platform Keynote EDM/SSDS/SQL Data Types

By Zach Skyles Owens

One of the biggest challenges we faced when designing this demo was making the breadth of Microsoft’s Data Platform technologies easily understandable in 5 minutes.  The major architecture components are SQL Server 2008, SQL Server Data Services (SSDS), SQL Server Compact and Sync, which I outlined in a previous post.  We’ve been following the blogosphere as closely as possible and, understandably, one area where we’ve seen some confusion is type support and conversion between the Entity Data Model, SSDS and SQL Server 2008.

There are a lot of moving parts here so I’ll do my best to explain how everything was integrated and how we got around some of the differences in type support between the data platform technologies.

Web Application

Bloggers use this application to submit geo-tagged articles and images which are stored in SSDS.  We embedded a Virtual Earth control that users use to geo-tag their content manually.  Those who have been following SSDS closely may have noted that SSDS does not currently support blob storage or spatial types.

  • Blob Storage in SSDS - This is the number one feature that customers are asking for right now with SSDS.  Until SSDS supports blobs, our workaround is to Base64-encode the images for storage in SSDS.
  • Spatial Storage in SSDS - Currently SSDS does not support Spatial types.  This was easy for us to work around by converting the spatial POINT coordinates to the Well-Known Text (WKT) format and storing them as text.
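
As a rough illustration of both workarounds, here is a minimal C# sketch. The class and method names are hypothetical and are not taken from the demo code, and the longitude-first ordering inside POINT is an assumption that should be checked against whatever later parses the text.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, for illustration only (not part of the demo code).
public static class SsdsEncoding
{
    // Encode raw image bytes as Base64 text so they can be stored as a string.
    public static string ImageToBase64(byte[] imageBytes)
    {
        return Convert.ToBase64String(imageBytes);
    }

    // Format a coordinate pair as a WKT POINT.
    // Assumption: x y order, i.e. longitude first, then latitude.
    public static string ToWktPoint(double latitude, double longitude)
    {
        // InvariantCulture keeps "." as the decimal separator on every locale.
        return string.Format(
            CultureInfo.InvariantCulture,
            "POINT({0} {1})",
            longitude,
            latitude);
    }
}
```

For example, SsdsEncoding.ToWktPoint(28.5, -81.25) produces the text POINT(-81.25 28.5), which can then be stored in SSDS like any other string.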

Mobile Application

In this scenario we are also using a Windows Mobile device that allows bloggers to submit photos which are automatically sent to SSDS via the Microsoft Sync Framework.  Here are some key points:

  • A SQL Server Compact database stores the application’s meta-data
  • Images are stored on the device’s file system
  • Geo-tagging data is pulled directly from the GPS-enabled device.  The app caches the last known GPS coordinates in case the GPS signal is lost.
  • The Sync provider running on the device converts the geo-data to WKT and the image to Base64 text as in the web app
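
The last-known-coordinates cache can be very small. The sketch below is one hypothetical way to do it; the class name and its members are invented for illustration, and the demo's actual implementation is not shown in this post.

```csharp
using System;

// Hypothetical cache of the most recent GPS fix (illustration only).
public sealed class LastKnownLocation
{
    private double? latitude;
    private double? longitude;

    // Record a fresh fix whenever the GPS reports one.
    public void Update(double lat, double lon)
    {
        latitude = lat;
        longitude = lon;
    }

    // Return the most recent fix, or false if no fix has ever been recorded.
    public bool TryGet(out double lat, out double lon)
    {
        lat = latitude ?? 0.0;
        lon = longitude ?? 0.0;
        return latitude.HasValue && longitude.HasValue;
    }
}
```

When the GPS has no signal, the app would simply skip Update and keep tagging photos with whatever TryGet returns.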

Sync from SSDS to SQL Server

Our scenario includes a powerful WPF desktop application connected to a local SQL Server 2008 database.  Since SQL Server 2008 supports Spatial we have the ability to run high performance spatial queries which aren’t currently possible in SSDS.  FILESTREAM allows us to use the high performance of the file system for binary file storage along with the transactional consistency and great manageability of the database.  Type conversion here was very simple and is outlined below:

  • Sync and FILESTREAM - The Sync Provider sitting on our SQL Server database pulls the Base64-encoded image data from SSDS, decodes it and inserts the binary data into a varbinary(max) FILESTREAM column in our database.
  • Sync and Spatial - This Sync Provider inserts the WKT POINT data into a SQL Server GEOGRAPHY type.
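
On the SQL Server side the conversion runs in reverse. The following sketch assumes a hypothetical [dbo].[content] table with [image] and [location] columns, and assumes SRID 4326 (WGS 84) for the GEOGRAPHY value; neither detail comes from the demo.

```csharp
using System;

// Hypothetical helper, for illustration only (not part of the demo code).
public static class SsdsToSqlServer
{
    // Assumed table, column names and SRID. @image is bound to the decoded
    // bytes and @wkt to the WKT text exactly as it was stored in SSDS.
    public const string InsertSql =
        "INSERT INTO [dbo].[content] ([image], [location]) " +
        "VALUES (@image, geography::STGeomFromText(@wkt, 4326));";

    // Turn the Base64 text stored in SSDS back into raw bytes for @image.
    public static byte[] DecodeImage(string base64Text)
    {
        return Convert.FromBase64String(base64Text);
    }
}
```

Binding both values as parameters leaves the WKT parsing to SQL Server, so the provider does not need any spatial logic of its own.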

EDM, Spatial and FILESTREAM

Our WPF application uses the Entity Data Model (EDM) to provide the application developers with a more natural business representation of the data. This allows the database model and application data model to evolve independently. Currently the EDM has limited support for FILESTREAM and does not natively support the new SQL Server 2008 Spatial types (GEOGRAPHY and GEOMETRY).  These were also very easy to work around in the following ways.

  • EDM and FILESTREAM - Currently the EDM treats FILESTREAM in the same way it treats any other varbinary(max) column.  You get the transactional consistency and manageability of the database.  However, the EDM interacts with the FILESTREAM data through T-SQL, so you don’t get the Win32 streaming performance that FILESTREAM is able to provide.  If that type of read/write performance is needed you can easily write a section of code that uses traditional database connections and SQL instead.
  • EDM and Spatial - Currently the EDM does not support the new Spatial types.   Our WPF application had two main requirements for Spatial: high performance queries and showing the Spatial meta-data.  We wrote a stored procedure for the queries and mapped a calculated column which converted the spatial data into WKT for displaying the meta-data.

Summary

There were definitely some things that we had to consider when building this application but in the end none were major barriers.  The application works great and was an enjoyable development experience. 

Posted in EDM, FileStream, MSDN, SQL Server, SSDS, Spatial, Sync Framework, TechEd, WPF

03rd Jun 2008

TechEd 2008 Developers - Data Platform Keynote Demo Architecture

By Zach Skyles Owens

Just minutes ago at TechEd 2008 Developers in Orlando, Dave Campbell, Technical Fellow, was on stage with Bill Gates doing the Data Platform demo, which included a host of exciting Microsoft technologies.

  • SQL Server 2008 - Supports any data
  • SQL Server Data Services (SSDS) - Quickly provisioning for unpredictable scale
  • Microsoft Sync Framework - Keeping all data synchronized

Building this demo has been one of the most rewarding experiences in my career.  Everyone involved has put in 110%.

When we set out to build this demo we spent a lot of time making sure that the architecture was something that we have heard customers asking for.  Since SSDS is such a new piece of the Microsoft Data Platform, going through this process was an interesting experience.

In this article I will quickly highlight the architecture of this application and describe why we made some of the architecture decisions.

Scenario

This application was built around a fictitious company called Trey Research.  Trey Research is a news agency that has launched a new strategy aimed at turning bloggers into paid journalists by paying them for their articles and photos.  Bloggers from around the world submit articles and images through either a web application or a Windows Mobile app.  News analysts at Trey Research find the best articles and images for a given geographic area of interest, combine them into a story and sell it to companies like MSNBC, paying the content creators in the process.

Architecture

Here is a high-level overview of the architecture.

[Image: high-level architecture diagram]

SQL Server 2008 and WPF

Let’s start with the News Analyst WPF app and SQL Server 2008.  We chose to connect the WPF desktop application to a local SQL Server database for a number of reasons, including:

  • High Performance of a Local Database
  • Powerful Analytics of the SQL Server platform
  • Ability to execute Spatial Queries to search for relevant content
  • Storage of all content types including geo-tagged text and images

News Analysts at Trey Research can search for content with a powerful UI that includes Spatial and Time based queries.  SQL Server uses powerful analytics on the back end to determine the target demographic for the content.  Some of the exciting technologies being used are:

  • FILESTREAM to store binary data
  • New DateTimeOffset type used to store the date and time a photo was taken, which preserves time zone information
  • Spatial queries and indexes which allow for the fast retrieval of geo-tagged data.
  • SQL Server Reporting Services to provide rich visualization of analytical data.
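
To illustrate why DateTimeOffset suits photo timestamps, here is a small C# sketch using the .NET type of the same name. The capture time and the five-hour offset are made up for the example, and the class name is hypothetical.

```csharp
using System;

// Hypothetical example values, for illustration only.
public static class PhotoTimestamps
{
    // A photo taken at 09:30 local time in a zone five hours behind UTC.
    public static DateTimeOffset Example()
    {
        return new DateTimeOffset(2008, 6, 3, 9, 30, 0, TimeSpan.FromHours(-5));
    }
}
```

The value returned by PhotoTimestamps.Example() keeps both pieces of information: its Offset property is -05:00 and its UtcDateTime property is 14:30 on the same day, so photos from different time zones can be ordered correctly without losing the photographer's local time.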

SQL Server Data Services, Web Application and Mobile Device

All of this content is submitted via a web site or a Windows Mobile application and is stored in SSDS.  Trey Research’s core competency is managing news content, not building Internet-scale data centers.  Trey Research decided to use SSDS as the data storage platform for a number of reasons:

  • To quickly provision for the unpredictable scale of what is turning out to be a very popular web site
  • The nature of news is that a large event can produce huge spikes in traffic, so they are relying on Microsoft’s global infrastructure and Service Level Agreements
  • SSDS is acting as a Data Hub where web and mobile devices submit content

Microsoft Sync Framework

The final piece of magic in this application is the Microsoft Sync Framework.  Sync is used to pull data down from SSDS into the local SQL Server database in addition to moving data from the mobile device into SSDS.  The Sync Framework provides a powerful platform for dealing with concerns such as conflict detection.

Summary

As you can see there were a number of architecture decisions that had to be made.  Trey Research is using many of the powerful features of the Microsoft Data Platform to quickly build this application.  I’ll continue to post information about this demo.

Posted in Business Intelligence, EDM, FileStream, MSDN, SQL Server, SSDS, Spatial, Sync Framework, TechEd, WPF

03rd Mar 2008

To FILESTREAM or not to FILESTREAM… That is the question.

Roger and I have been delivering a demo which highlights Spatial and FILESTREAM features of SQL Server 2008.  One of the common things we’ve heard from developers is…

For years we’ve been told that large binary files should never be stored in the database…  Are you telling us to start storing these files in the database now?  If so, why?  Are we just supposed to throw this best practice out the window?

This is really a great question which has prompted some interesting discussions.  Obviously the answer is not black and white. So let’s start by looking at what has changed.

 

SQL Server 2008 is now a very powerful engine for storing binary files. 

  • These files can be accessed through high performance Win32 streaming API’s in addition to T-SQL.
  • These files are managed by SQL Server in their own file groups, which can be backed up and restored along with the rest of your SQL Server data.  On the flip side, you aren’t required to back up and restore these file groups.
  • Reading and writing these files can now be part of a database transaction.

So you might be thinking to yourself…

Sounds great!!!  Let’s start storing all of our binary data in SQL Server.

Well, there are some considerations to be made before signing up to rewrite your app to take advantage of FILESTREAM.  Here are some of the main considerations.

Do other applications need direct access to your binary files?

If you read my article about writing files to FILESTREAM you probably noticed that you have to go through SQL Server to access the data in FILESTREAM.  There is no concept of
OpenFile("C:\Path_To_My_File\File_Name.docx")

Does your architecture require database mirroring?

Database mirroring does not yet support FILESTREAM.

Those are just a couple of the things to think about.  I’d recommend checking out our FILESTREAM sample on CodePlex and making some decisions for yourself.

FILESTREAM is a great technology and we are really excited to see how developers incorporate it into their applications.  Feel free to post comments here about your experience integrating FILESTREAM into your architecture.

Posted in FileStream, MSDN, SQL Server

03rd Mar 2008

SQL Server 2008 FILESTREAM and WPF MediaElement - Part 2 (Writing FILESTREAM Data)

Wow…  It’s been a long time coming.  I promised that I would explain in more detail how to write FILESTREAM data to SQL Server 2008.  This is the second article in a series and uses the sample published on the SQL Server Community Samples site on CodePlex.

Writing data to a varbinary(max) FILESTREAM column in SQL Server is a bit more involved than just opening a file on the file system.  SQL Server needs to manage this operation within a transaction, which adds a bit of complexity.  Here are the basic steps…  These steps apply both to reading and writing.

  1. Start a SQL Server transaction
  2. Insert a row into the table containing metadata
  3. Select the PathName from SQL Server which will be used to get a handle
  4. Open a handle for writing using sqlncli10.dll
  5. Use that handle within System.IO classes
  6. Commit the transaction

Now that the basic steps are laid out, let’s take a closer look.

// Start up a database transaction.
SqlTransaction txn = cxn.BeginTransaction();

No need for explanation there.

// Insert a row into the table to create a handle for streaming write.
SqlCommand cmd = new SqlCommand("INSERT [dbo].[media]([mediaId], [fileName], [contentType]) VALUES( @mediaId, @fileName, @contentType);", cxn, txn);

This is worth talking about a bit.  Why do you need to insert a row with metadata?  The answer is that in order to get a handle to the FILESTREAM column the row cannot have a NULL value in the FILESTREAM column.  This took some trial and error to discover.

If you look closely at the create table script in the sample code you will see that the varbinary(max) FILESTREAM column default is set to a zero byte binary value.

file varbinary(max) FILESTREAM DEFAULT(0x)

This should make a bit more sense once we look at the next step.

// Get a filestream PathName token and filestream transaction context.
// These items will be used to open up a file handle against the empty blob instance.
cmd = new SqlCommand("SELECT [file].PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT() FROM [dbo].[media] WHERE [mediaId] = @mediaId;", cxn, txn);
...
// Read in results of file.PathName()
SqlString sqlFilePath = rdr.GetSqlString(0);

So…  Here’s where I think things get interesting.  You can see that the SELECT statement above calls the PathName() method of the FILESTREAM column.  Here’s an example of the results from the query above. Note the use of UDTs and CLR!

\\ZOWENS-NB3\MSSQLSERVER\v1\FilestreamWpfHttp\dbo\media\file\4C3C9C2D-8268-43FF-8317-D507319FE21C

This is a “virtual” path managed by SQL Server.  It consists of \\COMPUTER_NAME, followed by a configurable handler \MSSQLSERVER…

Now what?

// Get a Win32 file handle to the empty blob instance using SQL Native Client call.
// This is required in order to write to the empty blob instance.
SafeFileHandle handle = SqlNativeClient.OpenSqlFilestream(
        sqlFilePath.Value,
        SqlNativeClient.DESIRED_ACCESS_WRITE,
        0,
        transactionToken.Value,
        (UInt32)transactionToken.Value.Length,
        new SqlNativeClient.LARGE_INTEGER_SQL(0));

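The code above uses a simple C# class “SqlNativeClient” that wraps the sqlncli10.dll I mentioned above.  This C# class is key to working with FILESTREAM in managed code.  You can see that we passed in the sqlFilePath variable from the PathName() query, along with transactionToken, which holds the GET_FILESTREAM_TRANSACTION_CONTEXT() value returned by the same query (reading it is part of the code elided above).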

// Open up a new stream to write the file to the blob.
System.IO.FileStream destBlob = new System.IO.FileStream(handle, FileAccess.Write);

“Old school” System.IO file manipulation using the handle obtained from the SqlNativeClient class above.
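
The walkthrough does not show the write itself, so here is a minimal sketch of that step. It assumes the source is any readable System.IO.Stream; the BlobWriter class and the CopyAll method are hypothetical names and are not part of the CodePlex sample.

```csharp
using System.IO;

// Hypothetical helper, for illustration only (not part of the CodePlex sample).
public static class BlobWriter
{
    // Copy everything from source into destination in fixed-size chunks
    // and return the number of bytes written.
    public static long CopyAll(Stream source, Stream destination)
    {
        byte[] buffer = new byte[64 * 1024];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        destination.Flush();
        return total;
    }
}
```

In the flow above this would be called as BlobWriter.CopyAll(sourceStream, destBlob), where sourceStream is an assumed name for the file being uploaded.  The destBlob stream should then be closed before the transaction is committed, because the FILESTREAM handle needs to be released first.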

// Commit transaction
txn.Commit();

There you have it…  It’s not rocket science but there are a few tricks.

Posted in FileStream, MSDN, SQL Server

18th Dec 2007

FILESTREAM Support in SQL Server Express

A couple of great questions came up last week when discussing FILESTREAM with some people in the Microsoft field, and I thought they were worth sharing.

 

Q: Will SQL Server 2008 Express edition support FILESTREAM?

A: It sure will!

 

Q: Will the 4GB limit of Express apply to data stored in the FILESTREAM?

A: Nope…  Go ahead and store large binary files!

 

As you may have noticed I’m really excited about binary data becoming a first class citizen in SQL Server 2008.  Once you dig into it I think you will agree.

Posted in FileStream, MSDN, SQL Server