github elastic/elasticsearch-net 2.0.0-alpha1

pre-release, 8 years ago

This marks the first release of our 2.0 branch (now master), with well over 1,000 commits since 1.7.1 (currently the latest released NEST version in the 1.x range).

We are releasing it as alpha because we currently only support .NET 4.5 and up. To go from alpha to beta, we ideally still have to finish the following:

  • PCL/.NET Core support, which is in the works but not yet part of our build and CI infrastructure.
  • Our new documentation.

Please note that even though this is an alpha, we have fully tested this against Elasticsearch 2.0 up to 2.1 (the current latest version).

Back to the drawing board

We took some time to go back to the drawing board. NEST is quite old (started in 2010) and not all of the choices that have accumulated in the code base make sense anymore.
So we stepped back, properly formalized how we see the lifetime of a call, and worked off of that. Armed with the diagram below, we completely rewrote NEST's internals. The old TPL-based code base is now replaced with async/await, we have a much saner approach to exceptions and errors, and we expose enough information as an audit trail that you never have to guess what went down during a call.

[Diagram: the request pipeline]

Our internals now also reflect this:

IElasticClient exposes all the Elasticsearch API endpoints, e.g. client.Search. This calls into ITransport's two methods, Request and RequestAsync. The default ITransport then uses the passed-in IRequestPipelineFactory to create a RequestPipeline, which implements IPipeline.

This pipeline now handles all of the failover/sniffing/pinging logic and directly reflects the flow diagram.

We also simplified IConnection down to just two methods. This means the outer edges (ITransport and IConnection) are clean, and implementing your own should be really simple. All of these (and also IMemoryStreamProvider and IDateTimeProvider) can be injected on the constructor of the client.
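The layering described above can be pictured with a small, self-contained sketch. Every type here (the names ending in Sketch) is a hypothetical stand-in invented for illustration. None of these are the real NEST interfaces or signatures, and the real pipeline does far more than forward a call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-ins; these are NOT the real NEST interfaces.
public interface IConnectionSketch
{
    // The outermost edge: performs one HTTP-like call against one node.
    int Request(Uri node, string path);
}

public interface ITransportSketch
{
    // The other clean edge: what the client calls into.
    int Request(string path);
}

// Always answers 200 and records which node was hit; handy in tests.
public sealed class InMemoryConnectionSketch : IConnectionSketch
{
    public List<Uri> Seen { get; } = new List<Uri>();

    public int Request(Uri node, string path)
    {
        Seen.Add(node);
        return 200;
    }
}

public sealed class TransportSketch : ITransportSketch
{
    private readonly Uri _node;
    private readonly IConnectionSketch _connection;

    public TransportSketch(Uri node, IConnectionSketch connection)
    {
        _node = node;
        _connection = connection;
    }

    // A real pipeline would handle failover, sniffing and pinging here.
    public int Request(string path) => _connection.Request(_node, path);
}

public sealed class ClientSketch
{
    private readonly ITransportSketch _transport;

    // The transport (and through it the connection) is injected.
    public ClientSketch(ITransportSketch transport) { _transport = transport; }

    public int Search(string index) => _transport.Request("/" + index + "/_search");
}
```

Because both edges are plain interfaces, swapping in a different connection or transport only means passing a different object to the constructor, e.g. new ClientSketch(new TransportSketch(new Uri("http://localhost:9200"), new InMemoryConnectionSketch())).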

Test Framework

Another huge endeavour is the rework of our test framework. NEST 1.x was always well tested, but it used 5 different test projects and reflected 5 years' worth of changing our minds as to how best to write tests and assertions.
It had become a big hodgepodge of NUnit assertions, FluentAssertions, FakeItEasy and Moq, combined with several different ways to compare JSON with object graphs and vice versa. Trying to write a new test quickly became cumbersome because there was no clear-cut way to write said test.

So the first thing we did as part of our 2.0 branch was to completely delete all of our tests. This act of insanity gave us carte blanche during our rewrite.

As of 2.0 we have one test project, Tests, and all of the tests are written in such a way that they can be run in unit test mode and integration test mode. Write once, run differently.
All the API endpoint tests cover all 4 variations: 2 DSLs (fluent and object initializer), each in sync and async form. We also test all of the moving parts of the Elasticsearch DSL (Aggregations, Sorting, IndexSettings, etc.) in the same way.

We also introduced a thing we dubbed Literate Testing. It allows us to write tests in more of a storytelling form, with the comments serving as the asciidoc for our documentation, while using the Roslyn compiler to pick out the interesting bits of code.
This gives us the benefit of always compiling our documentation, and of having one place where we document, test and assert how a piece of code is supposed to work.

Another huge component of our testing framework is the Virtual Cluster that allows us to write tests for any situation and how we expect the client to behave.

/** we set up a 10 node cluster with a global request time out of 20 seconds.
* Each call on a node takes 10 seconds. So we can only try this call on 2 nodes
* before the max request time out kills the client call.
*/
var audit = new Auditor(() => Framework.Cluster
  .Nodes(10)
  .ClientCalls(r => r.FailAlways().Takes(TimeSpan.FromSeconds(10)))
  .ClientCalls(r => r.OnPort(9209).SucceedAlways())
  .StaticConnectionPool()
  .Settings(s => s.DisablePing().RequestTimeout(TimeSpan.FromSeconds(20)))
);

audit = await audit.TraceCalls(
  new ClientCall {
    { BadResponse, 9200 }, //10 seconds
    { BadResponse, 9201 }, //20 seconds
    { MaxTimeoutReached }
  },
  /**
  * On the second client call we specify a request timeout override to 80 seconds
  * We should now see more nodes being tried.
  */
  new ClientCall(r => r.RequestTimeout(TimeSpan.FromSeconds(80)))
  {
    { BadResponse, 9203 }, //10 seconds
    { BadResponse, 9204 }, //20 seconds
    { BadResponse, 9205 }, //30 seconds
    { BadResponse, 9206 }, //40 seconds
    { BadResponse, 9207 }, //50 seconds
    { BadResponse, 9208 }, //60 seconds
    { HealthyResponse, 9209 },
  }
);

This showcases the Virtual Cluster tests combined with Literate Tests and the extensive audit trail information available on each response (or exception).
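The arithmetic behind the expected audit trail can be checked with a small model. This is a hypothetical sketch, not code from the test framework: it assumes every failing call costs a fixed amount of time, a successful call costs none, and nodes are tried in port order until the overall request timeout is used up.

```csharp
using System;
using System.Collections.Generic;

public static class TimeoutBudgetSketch
{
    // Returns the ports tried before either a node succeeds or the overall
    // request timeout is used up. Hypothetical model, not NEST code.
    public static List<int> PortsTried(
        int firstPort, int lastPort, int healthyPort,
        TimeSpan perFailedCall, TimeSpan requestTimeout, out bool succeeded)
    {
        var tried = new List<int>();
        var elapsed = TimeSpan.Zero;
        succeeded = false;

        for (var port = firstPort; port <= lastPort; port++)
        {
            if (elapsed >= requestTimeout) break; // budget exhausted
            tried.Add(port);
            if (port == healthyPort) { succeeded = true; break; }
            elapsed += perFailedCall; // a failed call burns part of the budget
        }
        return tried;
    }
}
```

With 10 seconds per failed call and a 20 second timeout starting at port 9200, the model tries 9200 and 9201 and then gives up. With an 80 second timeout starting at port 9203, it tries 9203 through 9208 (60 seconds) and then succeeds on 9209, matching the two ClientCall expectations in the test.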

I'm pleased to say we are back at a decent coverage rate (60%) and we'll continue to bump that.

Exception handling

Another big change in NEST 2.0 is how we deal with exceptions.

In NEST 1.x, the client threw a multitude of exceptions: MaxRetryException, ElasticsearchAuthException, ElasticsearchServerException, DslException, etc. This made it challenging for users to handle exceptions and invalid responses, and to understand the root cause of errors. On top of that, which exceptions were thrown depended on the kind of IConnectionPool that was injected, in order to maintain maximum backwards compatibility with NEST 0.x.

In NEST 2.x, exceptions are much more deterministic. The former ThrowOnElasticsearchServerExceptions() setting has been replaced with simply ThrowExceptions(), which determines whether the client should ever throw an exception or not (both client side and server exceptions). Furthermore, the types of exceptions have been reduced and simplified: the client will now only throw three types of exceptions:

ElasticsearchClientException: These are known exceptions: either an exception that occurred in the request pipeline (such as max retries or timeout reached, bad authentication, etc.) or an error returned by Elasticsearch itself (could not parse the request, bad query, missing field, etc.). If it is an Elasticsearch error, the ServerError property on the response will contain the actual error that was returned. The inner exception will always contain the root cause exception.

UnexpectedElasticsearchClientException: These are unknown exceptions, for instance a response from Elasticsearch that could not be properly deserialized. These are usually bugs and should be reported. This exception also inherits from ElasticsearchClientException, so an additional catch block isn't necessary, but one can be helpful in distinguishing between the two.

Development time exceptions: These are CLR exceptions like ArgumentException, ArgumentNullException, etc., that are thrown when an API in the client is misused. These should not be handled, as you want to know about them during development.
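A caller that wants to distinguish the first two cases could order its catch blocks as in the following sketch. The two exception classes declared here are local stand-ins that mirror only the inheritance relationship described above, so that the example is self-contained. They are not the library's own definitions, and the Classify helper is hypothetical.

```csharp
using System;

// Local stand-ins that mirror only the inheritance relationship described
// above; the real types live in the client library.
public class ElasticsearchClientException : Exception
{
    public ElasticsearchClientException(string message, Exception inner = null)
        : base(message, inner) { }
}

public class UnexpectedElasticsearchClientException : ElasticsearchClientException
{
    public UnexpectedElasticsearchClientException(string message, Exception inner = null)
        : base(message, inner) { }
}

public static class ExceptionHandlingSketch
{
    public static string Classify(Action call)
    {
        try
        {
            call();
            return "ok";
        }
        // The derived type has to be caught first; C# rejects a base-type
        // catch block placed before a derived one.
        catch (UnexpectedElasticsearchClientException e)
        {
            return "unexpected (likely a bug, worth reporting): " + e.Message;
        }
        catch (ElasticsearchClientException e)
        {
            return "known: " + e.Message;
        }
        // ArgumentException and friends are deliberately not caught here,
        // so misuse of the API surfaces during development.
    }
}
```

A single catch (ElasticsearchClientException) block would handle both cases. The extra block only matters when unexpected failures should be logged or reported differently.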

Breaking Changes

Even though a lot of work went into the interior, the exterior did not escape unscathed! On top of the many breaking changes that Elasticsearch 2.0 introduces, there are more than a few that NEST 2.0 introduces itself.
We revalidated all the request and response domain objects against Elasticsearch 2.0.

We will do our best to compile a list when NEST 2.0 GA hits. If we moved your cheese to a spot where you can no longer find it, please open an issue and we'll be more than happy to help locate it.

Elasticsearch 2.x support

NEST 2.0 supports all the new features from Elasticsearch 2.0, including pipeline aggregations. Here we'll just highlight a couple of features that are reflected in NEST changes.

Removal of filters

NEST 2.0 reflects Elasticsearch 2.0's removal of filters and no longer has filter constructs in its code base.

Filtered query deprecation

With the removal of filters, NEST has added a special construct in its Query DSL to easily create a bool query with a filter clause:

.Query(q=> +q.Term(p=>p.Name, "NEST"))

Note the +: it wraps the term query inside a bool query's filter clause.

You can even combine this with !:

.Query(q=> !+q.Term(p=>p.Name, "NEST"))

This will wrap the term query inside a bool filter, and that bool inside a bool must_not. This also works for the object initializer syntax:

!+new TermQuery {}
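One way such unary operators can be made to work in C# is operator overloading on a common query base class. The following is a minimal sketch of that idea using a hypothetical miniature query model (QuerySketch, TermSketch, BoolSketch). It is not NEST's implementation, and the JSON it produces is only meant to show the nesting.

```csharp
using System;

// Hypothetical miniature query model, not NEST's own types.
public abstract class QuerySketch
{
    // Unary plus: wrap the query in a bool query's filter clause.
    public static QuerySketch operator +(QuerySketch q) => new BoolSketch("filter", q);

    // Logical not: wrap the query in a bool query's must_not clause.
    public static QuerySketch operator !(QuerySketch q) => new BoolSketch("must_not", q);

    public abstract string ToJson();
}

public sealed class TermSketch : QuerySketch
{
    private readonly string _field, _value;

    public TermSketch(string field, string value) { _field = field; _value = value; }

    public override string ToJson() =>
        "{\"term\":{\"" + _field + "\":\"" + _value + "\"}}";
}

public sealed class BoolSketch : QuerySketch
{
    private readonly string _clause;
    private readonly QuerySketch _inner;

    public BoolSketch(string clause, QuerySketch inner) { _clause = clause; _inner = inner; }

    public override string ToJson() =>
        "{\"bool\":{\"" + _clause + "\":[" + _inner.ToJson() + "]}}";
}
```

In this model, (!+new TermSketch("name", "NEST")).ToJson() applies + first and ! second, yielding a term query inside a bool filter inside a bool must_not: {"bool":{"must_not":[{"bool":{"filter":[{"term":{"name":"NEST"}}]}}]}}.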

Attribute based mapping

The single ElasticPropertyAttribute has been broken up into individual attributes per property type.

For instance, the following:

[ElasticType(Name = "othername", IdProperty = "MyId")]
public class Foo
{
  [ElasticProperty(Type = FieldType.String)]
  public Guid MyId { get; set; }

  [ElasticProperty(Type = FieldType.String)]
  public string Name { get; set; }

  [ElasticProperty(Type = FieldType.String, Analyzer = "myanalyzer", TermVector = TermVectorOption.WithOffsets)]
  public string Description { get; set; }

  [ElasticProperty(Type = FieldType.Date, Format = "mmmddyyyy")]
  public DateTime Date { get; set; }

  [ElasticProperty(Type = FieldType.Integer, Coerce = true)]
  public int Number { get; set; }

  [ElasticProperty(Type = FieldType.Nested, IncludeInParent = true)]
  public List<Bar> Bars { get; set; }
}

becomes

[ElasticsearchType(Name = "othername", IdProperty = "MyId")]
public class Foo
{
  [String]
  public Guid MyId { get; set; }

  [String]
  public string Name { get; set; }

  [String(Analyzer = "myanalyzer", TermVector = TermVectorOption.WithOffsets)]
  public string Description { get; set; }

  [Date(Format = "mmmddyyyy")]
  public DateTime Date { get; set; }

  [Number(NumberType.Integer, Coerce = true, DocValues = true)]
  public int Number { get; set; }

  [Nested(IncludeInParent = true)]
  public List<Bar> Bars { get; set; }
}

Aside from giving a simpler and cleaner API, this allows each attribute to reflect only the options that are available for its particular type, instead of exposing options that may not be relevant (as ElasticPropertyAttribute currently does).

For more details, see #1520.

C# 6 support

NEST's codebase has been largely rewritten to take advantage of all the cool new C# features, making almost all the fluent code one-liners.
We also expose the static Infer class which makes a great static using target.

using static Nest.Infer;

//later..

Field<Project>(p=>p.Name);
Index<Project>();
Indices<Project>().And<Developer>();

Feedback

Please let us know if you hit a snag using these new bits; nothing is too big or too small. We are looking to move fast to a NEST 2.0 GA and would hate to miss anything.
