Bringing big data together with other data systems can yield amazing insights, particularly when the results are presented in ways that let users see the information and make decisions that affect the business. As new technologies such as Hadoop, Azure Data Lake Analytics, and R enable cheaper distributed processing and improved analytical capabilities, the bar for getting insights from new types of data, as well as from ever-increasing volumes of data, has come down.
Crafting Bytes helps customers realize results through business solutions built on Microsoft® technologies.
The Crafting Bytes Big Data Group drives high-priority customer initiatives, leveraging Azure data services to solve the biggest and most complex data challenges faced by enterprise customers. We are involved across the full range of Azure data services technical architecture work: data architecture design sessions, implementation projects, and proofs of concept.
Our consultants have a wide range of experience with all Azure data products, including:
- Azure SQL Database
- Azure Cosmos DB (formerly DocumentDB)
- Azure SQL Data Warehouse
- Azure Machine Learning
- Stream Analytics
- Data Factory
- Event Hubs and Notification Hubs
We architect scalable data processing and analytics solutions. We assess technical feasibility for Big Data storage, processing, and consumption, e.g., developing an enterprise Data Lake strategy, heterogeneous data management, polyglot persistence, and decision support over a Data Lake.
The three main Azure Big Data products are:
Azure SQL Data Warehouse
This product is excellent for curated data that sits in a relational format. It has several key benefits, including the ability to interact with your data using all the same tooling you would use with an on-premises SQL database: SQL Server Management Studio, Visual Studio, and SSIS work with this product exactly as they do with other SQL Server products. In addition, SQL Data Warehouse uses PolyBase to quickly ingest text files into your relational data model.
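As a rough illustration of the PolyBase ingestion pattern, the sketch below generates the kind of T-SQL you would run against SQL Data Warehouse to expose a delimited text file as an external table and then load it into a distributed relational table. All object names (the data source, file path, and tables) are hypothetical placeholders.

```python
# Sketch: generate PolyBase DDL for loading a delimited text file into
# SQL Data Warehouse. Object names below are hypothetical placeholders.

def polybase_load_script(table, blob_path, columns):
    """Build T-SQL for an external table plus a CTAS load."""
    col_defs = ",\n    ".join(f"{name} {sqltype}" for name, sqltype in columns)
    return f"""
CREATE EXTERNAL FILE FORMAT CsvFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT,
      FORMAT_OPTIONS (FIELD_TERMINATOR = ','));

CREATE EXTERNAL TABLE ext_{table} (
    {col_defs}
)
WITH (LOCATION = '{blob_path}',
      DATA_SOURCE = MyBlobStorage,   -- assumes a pre-created data source
      FILE_FORMAT = CsvFormat);

-- CTAS pulls the file's rows into a distributed relational table.
CREATE TABLE {table}
WITH (DISTRIBUTION = ROUND_ROBIN)
AS SELECT * FROM ext_{table};
""".strip()

script = polybase_load_script(
    "Sales", "/incoming/sales/",
    [("OrderId", "int"), ("Amount", "decimal(18,2)")])
print(script)
```

Once the external table exists, the same file can also be queried in place with ordinary T-SQL before (or instead of) loading it.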
Azure Data Lake Analytics
This is a Platform-as-a-Service (PaaS) offering in Azure that charges you only per job execution. A key benefit is that it lets you control your degree of parallelism on a per-job basis. Your data can stay in one location while you run multiple jobs over it, with some jobs given more parallelism (and thus costing more) than others. And as a job's execution time increases, there is an easy and predictable remedy: allocate it more parallel units.
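To make the per-job billing model concrete, here is a back-of-the-envelope sketch. The hourly rate and the assumption of perfectly parallel work are illustrative only, not actual Azure pricing:

```python
# Back-of-the-envelope ADLA cost model: you pay per job for the
# analytics units (AUs) you request, for as long as the job runs.
# RATE_PER_AU_HOUR is a hypothetical figure, not real Azure pricing.
RATE_PER_AU_HOUR = 2.0

def job_cost(aus, hours):
    """Cost of one job at a given parallelism and runtime."""
    return aus * hours * RATE_PER_AU_HOUR

# For a perfectly parallel job, doubling the AUs halves the runtime,
# so the cost is unchanged -- you are choosing how fast it finishes.
base = job_cost(aus=10, hours=4.0)
fast = job_cost(aus=20, hours=2.0)
assert base == fast == 80.0

# A second job over the same data can run with its own, cheaper
# degree of parallelism, without moving the data anywhere.
small = job_cost(aus=2, hours=4.0)
assert small < base
```

In practice real jobs are not perfectly parallel, so adding AUs hits diminishing returns; the point is that the parallelism (and therefore the cost) is a per-job dial rather than a property of a standing cluster.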
Azure HDInsight
This is Azure’s Hortonworks Hadoop implementation in the cloud. It works with all of your existing Hadoop applications without a problem, including Hive, Pig, Storm, Kafka, etc. It is an easy path for bringing an on-premises Hadoop installation into Azure, and it is massively scalable.