Skip to main content

Posts tagged with 'Couchbase'

This is a repost that originally appeared on the Couchbase Blog: JSON Data Modeling for RDBMS Users.

JSON data modeling is a vital part of using a document database like Couchbase. Beyond understanding the basics of JSON, there are two key approaches to modeling relationships between data that will be covered in this blog post.

The examples in this post will build on the invoices example that I showed in CSV tooling for migrating to Couchbase from Relational.

Imported Data Refresher

In the previous example, I started with two tables from a relational database: Invoices and InvoicesItems. Each invoice item belongs to an invoice, which is done with a foreign key in a relational database.

I did a very straightforward (naive) import of this data into Couchbase. Each row became a document in a "staging" bucket.

Data imported from CSV

Next, we must decide if that JSON data modeling design is appropriate or not (I don’t think it is, as if the bucket being called "staging" didn’t already give that away).

Two Approaches to JSON data modeling of relationships

With a relational database, there is really only one approach: normalize your data. This means separate tables with foreign keys linking the data together.

With a document database, there are two approaches. You can keep the data normalized or you can denormalize data by nesting it into its parent document.

Normalized (separate documents)

An example of the end state of the normalized approach represents a single invoice spread over multiple documents:

key - invoice::1
{ "BillTo": "Lynn Hess", "InvoiceDate": "2018-01-15 00:00:00.000", "InvoiceNum": "ABC123", "ShipTo": "Herman Trisler, 4189 Oak Drive" }

key - invoiceitem::1811cfcc-05b6-4ace-a52a-be3aad24dc52
{ "InvoiceId": "1", "Price": "1000.00", "Product": "Brake Pad", "Quantity": "24" }

key - invoiceitem::29109f4a-761f-49a6-9b0d-f448627d7148
{ "InvoiceId": "1", "Price": "10.00", "Product": "Steering Wheel", "Quantity": "5" }

key - invoiceitem::bf9d3256-9c8a-4378-877d-2a563b163d45
{ "InvoiceId": "1", "Price": "20.00", "Product": "Tire", "Quantity": "2" }

This lines up with the direct CSV import. The InvoiceId field in each invoiceitem document is similar to the idea of a foreign key, but note that Couchbase (and distributed document databases in general) do not enforce this relationship in the same way that relational databases do. This is a trade-off made to satisfy the flexibility, scalability, and performance needs of a distributed system.

Note that in this example, the "child" documents point to the parent via InvoiceId. But it could also be the other way around: the "parent" document could contain an array of the keys of each "child" document.

Denormalized (nested)

The end state of the nested approach would involve just a single document to represent an invoice.

key - invoice::1
{
  "BillTo": "Lynn Hess",
  "InvoiceDate": "2018-01-15 00:00:00.000",
  "InvoiceNum": "ABC123",
  "ShipTo": "Herman Trisler, 4189 Oak Drive",
  "Items": [
    { "Price": "1000.00", "Product": "Brake Pad", "Quantity": "24" },
    { "Price": "10.00", "Product": "Steering Wheel", "Quantity": "5" },
    { "Price": "20.00", "Product": "Tire", "Quantity": "2" }
  ]
}

Note that "InvoiceId" is no longer present in the objects in the Items array. This data is no longer foreign—​it’s now domestic—​so that field is not necessary anymore.

JSON Data Modeling Rules of Thumb

You may already be thinking that the second option is a natural fit in this case. An invoice in this system is a natural aggregate-root. However, it is not always straightforward and obvious when and how to choose between these two approaches in your application.

Here are some rules of thumb for when to choose each model:

Table 1. Modeling Data Cheat Sheet
If …​Then consider…​

Relationship is 1-to-1 or 1-to-many

Nested objects

Relationship is many-to-1 or many-to-many

Separate documents

Data reads are mostly parent fields

Separate document

Data reads are mostly parent + child fields

Nested objects

Data reads are mostly parent or child (not both)

Separate documents

Data writes are mostly parent and child (both)

Nested objects

Modeling example

To explore this deeper, let’s make some assumptions about the invoice system we’re building.

  • A user usually views the entire invoice (including the invoice items)

  • When a user creates an invoice (or makes changes), they are updating both the "root" fields and the "items" together

  • There are some queries (but not many) in the system that only care about the invoice root data and ignore the "items" fields

Then, based on that knowledge, we know that:

  1. The relationship is 1-to-many (a single invoice has many items)

  2. Data reads are mostly parent + child fields together

Therefore, "nested objects" seems like the right design.

Please remember that these are not hard and fast rules that will always apply. They are simply guidelines to help you get started. The only "best practice" is to use your own knowledge and experience.

Transforming staging data with N1QL

Now that we’ve done some JSON Data Modeling exercises, it’s time to transform the data in the staging bucket from separate documents that came directly from the relational database to the nested object design.

There are many approaches to this, but I’m going to keep it very simple and use Couchbase’s powerful N1QL language to run SQL queries on JSON data.

Preparing the data

First, create a "operation" bucket. I’m going to transform data and move it to from the "staging" bucket (containing the direct CSV import) to the "operation" bucket.

Next, I’m going to mark the 'root' documents with a "type" field. This is a way to mark documents as being of a certain type, and will come in handy later.

UPDATE staging
SET type = 'invoice'
WHERE InvoiceNum IS NOT MISSING;

I know that the root documents have a field called "InvoiceNum" and that the items do not have this field. So this is a safe way to differentiate.

Next, I need to modify the items. They previously had a foreign key that was just a number. Now those values should be updated to point to the new document key.

UPDATE staging s
SET s.InvoiceId = 'invoice::' || s.InvoiceId;

This is just prepending "invoice::" to the value. Note that the root documents don’t have an InvoiceId field, so they will be unaffected by this query.

After this, I need to create an index on that field.

Preparing an index

CREATE INDEX ix_invoiceid ON staging(InvoiceId);

This index will be necessary for the transformational join coming up next.

Now, before making this data operational, let’s run a SELECT to get a preview and make sure the data is going to join together how we expect. Use N1QL’s NEST operation:

SELECT i.*, t AS Items
FROM staging AS i
NEST staging AS t ON KEY t.InvoiceId FOR i
WHERE i.type = 'invoice';

The result of this query should be three total root invoice documents.

Results of transformation with N1QL

The invoice items should now be nested into an "Items" array within their parent invoice (I collapsed them in the above screenshot for the sake of brevity).

Moving the data out of staging

Once you’ve verified this looks correct, the data can be moved over to the "operation" bucket using an INSERT command, which will just be a slight variation on the above SELECT command.

INSERT INTO operation (KEY k, VALUE v)
SELECT META(i).id AS k, { i.BillTo, i.InvoiceDate, i.InvoiceNum, "Items": t } AS v
FROM staging i
NEST staging t ON KEY t.InvoiceId FOR i
where i.type = 'invoice';

If you’re new to N1QL, there’s a couple things to point out here:

  • INSERT will always use KEY and VALUE. You don’t list all the fields in this clause, like you would in a relational database.

  • META(i).id is a way of accessing a document’s key

  • The literal JSON syntax being SELECTed AS v is a way to specify which fields you want to move over. Wildcards could be used here.

  • NEST is a type of join that will nest the data into an array instead of at the root level.

  • FOR i specifies the left hand side of the ON KEY join. This syntax is probably the most non-standard portion of N1QL, but the next major release of Couchbase Server will include "ANSI JOIN" functionality that will be a lot more natural to read and write.

After running this query, you should have 3 total documents in your 'operation' bucket representing 3 invoices.

Result from JSON data modeling transformation

You can delete/flush the staging bucket since it now contains stale data. Or you can keep it around for more experimentation.

Summary

Migrating data straight over to Couchbase Server can be as easy as importing via CSV and transforming with a few lines of N1QL. Doing the actual modeling and making decisions requires the most time and thought. Once you decide how to model, N1QL gives you the flexibility to transform from flat, scattered relational data into an aggregate-oriented document model.

More resources:

Feel free to contact me if you have any questions or need help. I’m @mgroves on Twitter. You can also ask questions on the Couchbase Forums. There are N1QL experts there who are very responsive and can help you write the N1QL to accommodate your JSON data modeling.

This is a repost that originally appeared on the Couchbase Blog: CSV tooling for migrating to Couchbase from Relational.

CSV (Comma-seperated values) is a file format that can be exported from a relational database (like Oracle or SQL Server). It can then be imported into Couchbase Server with the cbimport utility.

Note: cbimport comes with Couchbase Enterprise Edition. For Couchbase Community Edition, you can use the more limited cbtransfer tool or go with cbdocloader if JSON is an option.

A straight relational→CSV→Couchbase ETL probably isn’t going to be the complete solution for data migration. In a later post, I’ll write about data modeling decisions that you’ll have to consider. But it’s a starting point: consider this data as "staged".

Note: for this post, I’m using SQL Server and a Couchbase Server cluster, both installed locally. The steps will be similar for SQL Server, Oracle, MySQL, PostgreSQL, etc.

Export to CSV

The first thing you need to do is export to CSV. I have a relational database with two tables: Invoices and InvoiceItems.

Relational tables example

I’m going to export the data from these two tables into two CSV files. With SQL Server Management Studio, this can be done a number of different ways. You can use sqlcmd or bcp at the command line. Or you can use Powershell’s Invoke-Sqlcmd and pipe it through Export-Csv. You can also use the SQL Server Management Studio UI.

Export CSV from SQL Server Management Studio

Other relational databases will have command line utilities, UI tools, etc to export CSV.

Here is an example of a CSV export from a table called "Invoices":

Id,InvoiceNum,InvoiceDate,BillTo,ShipTo
1,ABC123,2018-01-15 00:00:00.000,Lynn Hess,"Herman Trisler, 4189 Oak Drive"
2,XYZ987,2017-06-23 00:00:00.000,Yvonne Pollak,"Clarence Burton, 1470 Cost Avenue"
3,FOO777,2018-01-02 00:00:00.000,Phillip Freeman,"Ronda Snell, 4685 Valley Lane"

Here’s an export from a related table called "InvoiceItems":

InvoiceId,Product,Quantity,Price
1,Tire,2,20.00
1,Steering Wheel,5,10.00
1,Engine Oil,10,15.00
1,Brake Pad,24,1000.00
2,Mouse pad,1,3.99
2,Mouse,1,14.99
2,Computer monitor,1,199.98
3,Cupcake,12,.99
3,Birthday candles,1,.99
3,Delivery,1,30.00

Load CSV into Couchbase

Let’s import these into a Couchbase bucket. I’ll assume you’ve already created an empty bucket named "staging".

First, let’s import invoices.csv.

Loading invoices

C:\Program Files\Couchbase\Server\bin\cbimport csv -c localhost -u Administrator -p password -b staging -d file://invoices.csv --generate-key invoice::%Id%

Note: with Linux/Mac, instead of C:\Program Files\Couchbase\Server\bin, the path will be different.

Let’s break this down:

  • cbimport: This is the command line utility you’re using

  • csv: We’re importing from a CSV file. You can also import from JSON files.

  • -c localhost: The location of your Couchbase Server cluster.

  • -u Administrator -p password: Credentials for your cluster. Hopefully you have more secure credentials than this example!

  • -b staging: The name of the Couchbase bucket you want the data to end up in

  • --generate-key invoice::%Id% The template that will be used to create unique keys in Couchbase. Each line of the CSV will correspond to a single document. Each document needs a unique key. I decided to use the primary key (integer) with a prefix indicating that it’s an invoice document.

The end result of importing a 3 line file is 3 documents:

CSV documents imported into Couchbase

At this point, the staging bucket only contains invoice documents, so you may want to perform transformations now. I may do this in later modeling examples, but for now let’s move on to the next file.

Loading invoice items

C:\Program Files\Couchbase\Server\bin\cbimport csv -c localhost -u Administrator -p password -b staging -d file://invoice_items.csv --generate-key invoiceitem::#UUID#

This is nearly identical to the last import. One difference is that it’s a new file (invoice_items.csv). But the most important difference is --generate—​key. These records only contain foreign keys, but each document in Couchbase must have a unique key. Ultimately, we may decide to embed these records into their parent Invoice documents. But for now I decided to use UUID to generate unique keys for the records.

The end result of importing this 10 line file is 10 more documents:

More CSV documents imported into Couchbase

What’s next?

Once you have a CSV file, it’s very easy to get data into Couchbase. However, this sort of direct translation is often not going to be enough on its own. I’ve explored some aspects of data modeling in a previous blog post on migrating from SQL Server, but I will revisit this Invoices example in a refresher blog post soon.

In the meantime, be sure to check out How Couchbase Beats Oracle for more information on why companies are replacing Oracle for certain use cases. And also take a look at the Moving from Relational to NoSQL: How to Get Started white paper.

If you have any questions or comments, please feel free to leave them here, contact me on Twitter @mgroves, or ask your question in the Couchbase Forums.

This is a repost that originally appeared on the Couchbase Blog: Test Drive: Trying Couchbase on Azure for Free.

Test Drive is an Azure feature that allows you to try out software in the cloud for free.

Previously, I wrote about how Getting started with Couchbase on Azure is easy and free. That post told you how to get started with $200 in Azure credit. With Test Drive, however, it’s even easier to get started, and you don’t have to use any of that $200 credit.

Test Drive on Azure Marketplace

The only thing you need before you start is a Microsoft account, which is free.

Create account for free

Couchbase Azure product page

There’s a big "Test Drive" button. Simply click to get started.

Launch test drive

Before you click to get started, notice that the Test Drive has a limited duration. You will get 3 hours to evaluate Couchbase Server.

Step by Step Evaluation Labs

We’ve prepared 4 step-by-step labs that you can follow to start evaluating Couchbase Server.

The test drive deployment is a fully functional Couchbase Server Enterprise cluster with all features enabled. The only limit you have is time.

Summary

Couchbase is proud to partner with Microsoft to make this test drive available to you.

If you are interested in learning more about Couchbase on Azure, please check out these other blog posts:

If you have any questions, comments, or suggestions, please leave a comment, email me [email protected], or tweet me @mgroves on Twitter.

2017 Year in Review

January 01, 2018 mgroves 0 Comments
Tags: podcast analytics couchbase seo

This has been a good year for Cross Cutting Concerns. I had some amazing guests on the podcast. The C# Advent was also way more successful than I anticipated. And I also made some important technical improvements to the site.

Google Analytics

I pulled the top 50 most viewed page from Google Analytics. The top 10 pages that were viewed the most this year are listed below. The C# Advent page got over double the views of the second place page. (The second place page is a post from 2014 on ASP Classic that baffles me with the amount of traffic it gets).

PagePageviews
The First C# Advent Calendar (2017) 5503
Using HTTP/Json endpoints in ASP Classic (2014) 2413
Command/Query Object pattern (2014) 2229
How I use Fluent Migrator (2014) 2010
ActionFilter in ASP.NET MVC - OnActionExecuting (2012) 1386
Parsing XML in ASP classic (2014) 974
Visual Studio Live Unit Testing: New to Visual Studio 2017 (2017, Couchbase Blog repost) 964
AOP vs decorator (2012) 954
SQL to JSON Data Modeling with Hackolade (2017, Couchbase Blog repost) 895

That's right, some of the most viewed pages on my site have to do with ASP Classic and XML. These are posts I did on a lark during a consulting gig back in 2014.

It always seems like the posts I do on a lark are the ones that take off. For instance, over at the Couchbase Blog, I believe I have the most viewed blog post of 2017 with Hyper-V: How to run Ubuntu (or any Linux) on Windows. This is a quick post I wrote as I was learning it myself, and it keeps raking in the views. It's #2 on Bing and Google when you search for "hyper-v ubuntu", so that helps.

I'm not just looking to raw views, though. I would like to have some measure of the quality of posts. If you know of any metrics that might help track that, please let me know. Google Analytics has a "Bounce Rate" which might be useful to look at. The 10 pages with the lowest bounce rate (out of the 50 most viewed pages) are all podcast posts!

PageBounce Rate
Podcast 060 - Dean Hume on Progressive Web Apps 67.21%
Podcast 056 - Jeremy Clark Convincing Your Boss on Unit Testing 70.65%
Podcast 053 - William Straub on Recruiting 71.76%
Podcast 052 - Angie Buccilli on Recruiting Secret Sauce 71.76%
Podcast 028 - Jeremy Miller on Marten 74.73%
Podcast 062 - Ted Neward on Akka 74.79%
Podcast 030 - Steven Murawski on Rust 76.47%
Podcast 041 - Eric Potter on C# Pattern Matching 81.54%
Podcast 061 - Eric Elliott on TDD 81.74%

I'm going to speculate and say that podcast pages have the lowest bounce rate because they a prominent and immediately useful call-to-action link (i.e. "listen to this podcast"). Excluding the podcasts, the top 10 links with the lowest bounce rate are:

PageBounce Rate
A Coryat scorekeeper for Jeopardy (2014) 82.35%
The First C# Advent Calendar (2017) 83.30%
Autocomplete multi-select of Geographical Places (2014) 85.39%
Lessons learned about Fluent Migrator (2014) 85.92%
AOP in JavaScript with jQuery (2012) 86.46%
Terminology: cross cutting concern (2012) 86.58%
Adventures in Yak Shaving: AsciiDoc with Visual Studio Code, Ruby, and Gem (2017) 87.42%
An Audit ActionFilter for ASP.NET MVC (2012) 88.13%
Using HTTP/Json endpoints in ASP Classic (2012) 88.95%

Once again, ASP Classic appears, but it's interesting to see a mostly different set of posts here. The average bounce rate for the top 50 most viewed pages is 90.80%. So these all beat the average (if that has any meaning).

Podcast Analytics

I've done a poor job of tracking podcast analytics since I started the podcast. I assumed I could grab download numbers from Azure (where I host my podcast files), but that turns out to be incredibly painful. I mainly do the podcast for fun and because I want to talk to enthusiatic tech people. But in my attempts to get sponsorship, I quickly realized that I needed a better solution for analytics. I signed up for PodTrac, but only after season 2 was finished. So these numbers aren't going to be very impressive. Season 3 onwards should provide more useful analytics. The top 10 are the 10 latest podcasts that I published (which makes sense).

The #1 most downloaded episode based solely on my better late than never PodTrac analytics is #061 - Eric Elliott on TDD.

Tech improvements

I've made some changes to Cross Cutting Concerns to hopefully improve SEO and your experience as a reader/listener.

  • HTTPS. I host this site on a shared website on Azure, so it's not exactly straightforward. But I used CloudFlare and followed this blog post from Troy Hunt.
  • HTML Meta. I added Twitter cards, tagging, description, and so forth. This makes my posts look a little nicer on Twitter and search engines, and hopefully will improve my search rankings. If you want to see what I did, hit CTRL+U/View Source right now and check out all the <meta />
  • If you clicked on some of the top 10 posts earlier, you might have noticed a new green box with a call to action. I've put this on some of my most popular posts to try and drive some additional engagement, page views, and podcast subscribers.
  • Image optimization. pngcrush, gifsicle, and jpegtran losslessly optimize images so they are smaller downloads. This will help with my Azure bill a little bit, and also improve page speed. It's currently a manual process, so sometimes I will forget.

What's next?

Based on the analytics I'm seeing so far, I'm going to:

Continue:

  • Reposting my Couchbase blog posts. These help drive traffic back to my employer's site and increase awareness of Couchbase. Which is my job!
  • Podcasting. I'm enjoying it, some people are listening to it.
  • Keep podcast episodes short. I get comments in person about how the length of the shows (10-15 minutes) is just right. I'm going to expand by a few minutes (see below), but episodes will not increase in length by much more than that (unless I feel like making a longer special episode).
  • C# Advent. I've heard that this helps people get traffic to their blogs. I'm definitely happy with it, and it helps the C# / Microsoft MVP community. I'll start recruiting writers a little earlier in November 2018.

Start:

  • Adding some more fun to podcast episodes. I've got an idea to add a little humor to each podcast. Stay tuned!
  • Podcast sponsorship. I've lined up a sponsor for 6 months of episodes. Let's see how it goes. I'd like to use this money to buy better equipment, pay for hosting, and maybe even purchase tokens of appreciation for guests.

Stop:

  • Tracking podcast downloads with Azure and FeedBurner.
  • People from using ASP Classic. Somehow.

This is a repost that originally appeared on the Couchbase Blog: Scaling Couchbase Server on Azure.

Scaling is one of Couchbase Server’s strengths. It’s easy to scale, and Couchbase’s architecture makes an efficient use of your scaling resources. In fact, when Couchbase customer Viber switched from Mongo to Couchbase, they cut the number of servers they needed in half.

This blog post is the third in a (loose) series of posts on Azure.

The first post showed you the benefits of serverless computing on Azure with Couchbase.

The second post showed a concrete example of creating a chatbot using Azure functions and Couchbase Server.

The previous post only used a cluster with a single node for demonstration purposes. Now suppose you’ve been in production for a while, and your chatbot is starting to get really popular. You need to scale up your Couchbase cluster. If you deployed Couchbase from the Azure Marketplace, this is a piece of cake. Long story short: you pretty much just move a slider. But this post will take you all the way through the details:

  1. Creating a brand new cluster with 3 nodes.

  2. Scaling the cluster up to 5 nodes.

  3. Scaling the cluster down to 4 nodes.

Create Couchbase Cluster on Azure

Assuming you have an Azure account, login to the portal. If you don’t yet, Getting Started with Azure is Easy and Free.

Once you’re logged in, click "+ New" and search for Couchbase Server in the marketplace. I’m using BYOL (bring your own license) for demonstration, but there is also an "Hourly Pricing" option that comes with silver support.

Couchbase in the Azure Marketplace

Once you select Couchbase, you’ll be taken through an Azure installation wizard. Click the "Create" button to get started.

Azure step 0 - create

Step 1 is the "Basics". Fill out the username and password you want for Couchbase, the name of a resource group, and a location (I chose North Central US because it is close to me geographically). Make sure to make a note of this information, as you’ll need it later.

Azure step 1 - basics

The next step is Couchbase Config. There are some recommended VM types to use. I went with DS1_V2 to keep this blog post cheap, but you probably want at least 4 cores and 4gb of RAM for your production environment. I also elected not to install any Sync Gateway Nodes, but if you plan to use Couchbase Mobile, you will need these too. I’m asking for a total of 3 nodes for Couchbase Server.

Azure step 2 - Couchbase Config

After this, step 3 is just a summary of the information you’ve entered.

Azure step 3 - summary

The last step is "buy". This shows you the terms. One "Create" button is all that remains.

Azure step 4 - buy

Now, Azure will go to work provisioning 3 VMs, installing Couchbase Server on them, and then creating a cluster. This will take a little bit of time. You’ll get an Azure notification when it’s done.

Azure provisioning VMs

You should have just enough time to get yourself a nice beverage.

Using your Couchbase Cluster

When Azure finishes with deployment, go look at "Resource groups" in the Azure portal. Find your resource group. Mine was called my_cb_resource_group.

Azure resource groups

Click on the resource group. Inside that resource group, you’ll see 4 things:

  • networksecuritygroups (these are firewall rules, essentially)

  • vnet (the network that all the resources in the group are on)

  • server (Couchbase Server instances)

  • syncgateway (Couchbase Sync Gateway instances. I didn’t ask for any, so this is an empty grouping)

Azure resource group drill down

First, click 'server', and then 'instances'. You should see 3 servers (or however many you provisioned).

Couchbase Servers provisioned

Next, click 'deployments'. You should see one for Couchbase listed. Click it for more information about the deployment.

Azure deployment

The next screen will tell you the URL that you need to get to the Couchbase Server UI (and Sync Gateway UI if you installed that). It should look something like: http://vm0.server-foobarbazqux.northcentralus.cloudapp.azure.com:8091.

Azure deployment info

Paste that URL into a browser. You will be taken to the Couchbase Server login screen. Use the credentials you specified earlier to login.

Couchbase Server login

After you login, click on 'servers'. You will see the three servers listed here. The URLs will match the deployments you see in the Azure portal.

Let’s put some data in this database! Go to Settings → Sample Buckets and load the 'travel-sample' bucket.

Load sample data

This sample data contains 31591 documents. When it’s done loading, go back to "servers". You can see how the 'items' (and replica items) are evenly distributed amongst the three servers. Each node in Couchbase can do both reads and writes, so this is not a master/slave or a read-only replica sets situation.

Couchbase servers

Scaling up

Now, let’s suppose your application is really taking off, and you need to scale up to provide additional capacity, storage, performance. Since we’re using Couchbase deployed from the Azure marketplace, this is even easier than usual. Go to the Azure portal, back to the resource group, and click "server" again. Now click "scaling"

Azure scaling

Next, you will see a slider that you can adjust to add more instances. Let’s bump it up to 5 total servers. Make sure to click "save".

Azure scaling slider

Now, go back to 'instances' again. Note: you may have to refresh the page. Azure doesn’t seem to want to update the stale page served to the browser on its own. You will now see server_3 and server_4 in "creating" status.

Scaling additional servers

You will need to wait for these to be deployed by Azure. In the meantime, you can go back over to the Couchbase Server UI and wait for them to appear there as well.

New Couchbase Server nodes

When adding new servers, the cluster must be rebalanced. The Azure deployment should attempt to do this automatically (but just in case it fails, you can trigger the rebalance manually too).

Couchbase rebalancing

During this rebalance period, the cluster is still accessible from your applications. There will be no downtime. After the rebalance is over, you can see that the # of items on each server has changed. It’s been redistributed (along with replicas).

Cluster after rebalance

That’s it. It’s pretty much just moving a slider and waiting a few minutes.

Scaling Down

At some point, you may want to scale down your cluster. Perhaps you need 5 servers during a certain part of the year, but you only need 3 for other parts, and you’d like to save some money on your Azure bill.

Once again, this is just a matter of adjusting the slider. However, it’s a good idea to scale down one server at a time to avoid any risk of data loss.

Scaling down slider

When you scale down, Azure will pick a VM to decommission. Couchbase Server can respond in one of two ways:

  • Default behavior is to simply indicate that a node is down. This could trigger an email alert. It will show as 'down' in the UI.

  • Auto-failover can be turned on. This means that once a node is down, the Couchbase cluster will automatically consider it 'failed', promote the replicas on other nodes, and rebalance the cluster.

I’m going to leave auto-failover off and show the default behavior.

First, the server will show a status of 'deleting' in the Azure portal.

Scaling down - deleting

Soon after, Couchbase will recognize that a node is not responsive. It will suggest failover to 'activate available replicas'.

Couchbase failing node

I’ll go ahead and do just that.

Manual failover

Once it’s removed from the cluster, you’ll need to trigger a 'rebalance'.

Manual rebalance

Summary and resources

Scaling a Couchbase cluster on Azure is simply a matter of using the slider.

If you’re scaling down, consider doing it one node at a time.

For more information, check out these resources:

If you have questions, please contact me on Twitter @mgroves or leave a comment.

Matthew D. Groves

About the Author

Matthew D. Groves lives in Central Ohio. He works remotely, loves to code, and is a Microsoft MVP.

Latest Comments

Twitter