Guide - Cloud Data, Compute and Messaging with F#
Cloud computing relies on multiple integrated services. Using multiple services requires a unique set of technologies and capabilities, and F# excels in this domain. With the recent rise of cloud solutions, it is becoming increasingly easy to deploy multiple services “in the cloud”, expanding what is possible both by storing large amounts of data and by running heavy computations distributed across clusters of machines.
The combination of built-in support for asynchronous workflows, data processing capabilities, computation expressions, extensible syntax, composability, expressiveness for numeric code, and more makes F# uniquely suited to developing scalable cloud solutions efficiently.
This guide is an overview of the packages and tools for scalable compute, messaging, storage, and data processing with F#, particularly for taking advantage of cloud-computing resources.
For cloud-hosted web programming and services, refer to the Web Programming Guide.
This guide includes resources related to cloud programming with F#. To contribute to this guide, log on to GitHub, edit this page and send a pull request.
Note that the resources listed below are provided only for educational purposes related to the F# programming language. The F# Software Foundation does not endorse or recommend any commercial products, processes, or services. Therefore, mention of commercial products, processes, or services should not be construed as an endorsement or recommendation.
Resources for Cloud Programming
- Cloud Platforms
- Scalable Distributed Programming and Messaging
- Scalable Data Programming and NoSQL Databases
Cloud Platforms
Microsoft Azure provides access to Microsoft’s worldwide datacenters through services including virtual machines, geo-redundant storage, database clusters, and website deployment.
Using F# on Azure - Microsoft’s comprehensive guide to using F# on Azure.
F# and Azure Functions - Developer reference for using F# with Azure Functions.
Building Web, Cloud, and Mobile Solutions with F# - Book including details on Azure programming with F#.
FSharp.Azure.Storage - Provides an idiomatic F# API to query and modify data in Azure table storage using immutable F# record types.
F# Azure Storage Type Provider - Provides strongly-typed access to Blobs and Tables, with automatic generation of table schemas based on EDM metadata.
Amazon Web Services (AWS) provides a large array of on-demand and managed computing and hosting services. AWS includes on-demand and reserved virtual machine instances, a variety of storage options, a content delivery network (CDN), DNS capabilities, and many others. Amazon offers services from multiple data centers around the world.
Amazon offers a .NET SDK for managing AWS services. This SDK provides facilities for managing storage, compute instances, and other Amazon services.
Some additional resources for using F# and .NET on Amazon’s AWS service:
FSharp.AWS.DynamoDB - An F# wrapper over the standard Amazon.DynamoDB library that lets you represent table items using F# records and perform updates, queries, and scans using F# quotation expressions.
F# Template for AWS Lambda - A .NET Core 1.0 project and guide for publishing to AWS Lambda. Lambda doesn’t officially support F#; however, its recent support for .NET Core makes it possible to run compiled F# assemblies.
Scalable Distributed Programming and Messaging
Distributed compute problems require a wide range of communication capabilities, from simple command-line argument passing to heavily optimized, low-latency interprocess communication. This section lists communication libraries available to F#.
FSharp.CloudAgent and F# Mailbox Processor
The F# Mailbox Processor provides an Agent pattern for inter-thread communication directly within the core F# libraries.
FSharp.CloudAgent is a simple-to-use framework for creating distributable pools of workers or agents. It builds on F#’s native MailboxProcessor agent framework and uses Azure Service Bus to provide a cheap and reliable message bus.
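As an illustration of the agent pattern described above, here is a minimal sketch of an in-process agent built on MailboxProcessor from the F# core library. The message type `CounterMessage` and the agent name `counter` are hypothetical names chosen for this example. The sketch shows only the single-process building block and does not use FSharp.CloudAgent or Azure Service Bus.

```fsharp
// Messages the agent understands (hypothetical names for illustration).
type CounterMessage =
    | Add of int
    | Get of AsyncReplyChannel<int>

// Start an agent that owns a running total.
// The state is threaded through the recursive loop, so no locks are needed.
let counter =
    MailboxProcessor<CounterMessage>.Start(fun inbox ->
        let rec loop total =
            async {
                let! msg = inbox.Receive()
                match msg with
                | Add n -> return! loop (total + n)
                | Get reply ->
                    reply.Reply total
                    return! loop total
            }
        loop 0)

// Post is fire-and-forget; PostAndReply blocks until the agent answers.
counter.Post(Add 2)
counter.Post(Add 40)
let total = counter.PostAndReply(fun channel -> Get channel)
printfn "%d" total // prints 42
```

Because the agent handles its messages one at a time and in the order they were posted, the `Get` request is answered only after both `Add` messages have been applied.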
The Akka.NET framework is an open source toolkit and runtime for building highly concurrent, distributed, and fault-tolerant event-driven applications on .NET and Mono. It is used in production systems by its own contributors.
The MBrace framework is an open-source programming model and distributed runtime that enables scalable, fault-tolerant computation and data processing for the .NET/Mono frameworks.
The Orleans framework provides a straightforward approach to building distributed high-scale computing applications, without the need to learn and apply complex concurrency or other scaling patterns. It was designed for use in the cloud, and has been used extensively in Microsoft Azure. A simple ‘Hello World’ F# sample is also available.
Kafunk - An F# Kafka client.
anaerobic - A simple implementation of a Kafka producer and consumer in F#.
ZeroMQ - A general .NET binding for ZeroMQ.
MS-MPI - Microsoft’s implementation of the MPI protocol, available on some versions of Windows Server.
Ractor.CLR - A Redis-based distributed actor system.
Scalable Data Programming and NoSQL Databases
F# can be used with many scalable data-storage systems. Some are accessible via the Cloud SDKs outlined above. Some further resources for specific systems are:
Hadoop supports data-intensive distributed applications running on large clusters of commodity hardware. Hadoop derives from Google’s MapReduce and Google File System papers.
hadoop-sharp - CLR (.NET/Mono) interface for Hadoop
HadoopFs - A lightweight F# implementation of the Hadoop Streaming API
Microsoft .NET SDK For Hadoop - Includes LINQ to Hive and other resources
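To make the MapReduce model mentioned above concrete, here is a small word-count sketch in the style of Hadoop Streaming, where mappers and reducers exchange text lines of the form key, tab, value. It runs locally on an in-memory list and does not use the API of HadoopFs or any other library listed here. The function names `mapLine`, `parseRecord`, and `reduceRecords` are hypothetical.

```fsharp
open System

// Map step: emit one "word<TAB>1" record per word in an input line.
let mapLine (line: string) : string list =
    line.Split([| ' '; '\t' |], StringSplitOptions.RemoveEmptyEntries)
    |> Array.map (fun word -> sprintf "%s\t1" (word.ToLowerInvariant()))
    |> Array.toList

// Parse a "key<TAB>count" record back into a tuple.
let parseRecord (record: string) : string * int =
    let parts = record.Split('\t')
    parts.[0], int parts.[1]

// Reduce step: sum the counts for each key and return them sorted by key.
let reduceRecords (records: seq<string>) : (string * int) list =
    records
    |> Seq.map parseRecord
    |> Seq.groupBy fst
    |> Seq.map (fun (word, pairs) -> word, Seq.sumBy snd pairs)
    |> Seq.sortBy fst
    |> Seq.toList

// Local stand-in for the cluster: map every line, then reduce all records.
let counts =
    [ "the cat"; "The hat" ]
    |> List.collect mapLine
    |> reduceRecords

counts |> List.iter (fun (word, n) -> printfn "%s\t%d" word n)
```

In a real streaming job the map and reduce steps would run as separate processes reading standard input and writing standard output, with the framework sorting records by key between them. The pure functions here keep the logic testable without a cluster.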
Storm is a platform for realtime analytics, online machine learning, continuous computation, distributed RPC, ETL, and more. Capable of running on the same infrastructure as Hadoop clusters, it is scalable, fault-tolerant, guarantees your data will be processed, and is easy to set up and operate.
FsStorm - An F# library for implementing Apache Storm components and defining topologies in an F# DSL, with fine-grained access to multilang via a JSON AST.
FsShelter - A reimagined FsStorm that favours static typing, simplicity and modularity.
Hosted Storm on Azure HDInsight - Storm as a service on Azure HDInsight.
Riak is a NoSQL database implementing the principles from Amazon’s Dynamo paper:
Exploring Riak with F# - Explores the use of Riak from F# (Part I)
Exploring Riak with F# and CorrugatedIron - Explores the use of Riak from F# (Part II)
Using Riak MapReduce with F# - Explores the use of Riak from F# (Part III)
Cassandra is a distributed database management system designed to handle large amounts of data across many commodity servers, providing high availability with no single point of failure.
Aquiles - A .NET Client for Apache Cassandra version 0.6.X or above using Thrift API.
Cassandraemon - A LINQ Provider for Apache Cassandra
cassandra-sharp - A high performance .NET driver for Apache Cassandra
FluentCassandra - A .NET library for accessing Apache Cassandra
RavenDB is a scalable document-oriented database.
F# Client API - A thin wrapper around the standard RavenDB client API that provides a small set of combinators and a computation builder, hiding the complexity of dealing with LINQ expressions from F#. Its documentation assumes some familiarity with the basics of RavenDB.
MongoDB is a cross-platform document-oriented NoSQL database system.
Mongo DB - MongoDB bindings for .NET
Enhancing the F# developer experience with MongoDB - Extra options for the F# developer using MongoDB
Neo4j is an embedded, disk-based, fully transactional persistence engine that stores data structured in graphs rather than in tables.